FOREWORD


This report is one of a series of research efforts designed
to improve the selection and classification efficiency of the
Armed Services Vocational Aptitude Battery (ASVAB). The research
reported is unique in that it contributes both to methodological
issues in personnel assignment theory and to the formulation of
new job-matching policies based on scientific principles. It is
an example of how basic research can stimulate and provide
direction to applied research.

Two important general conclusions can be drawn from the
findings. First, we see a higher classification efficiency
inherent in the ASVAB than is usually posited. Second, the
existing operational assignment composites could be reconstituted to
substantially improve classification efficiency by expanding
the number of job families, by clustering jobs
into classification-efficient job families, and by using as
assignment variables least squares estimates of performance based on
all variables in the operational test battery.

Such a major reconstitution of job families in the Army's
classification systems must be based on all available validity
data as well as on information available from job analyses. A
number of personnel classification and assignment policy issues
also must be resolved before a new system incorporating
differential assignment concepts and principles can be implemented.
The results of this research, however, should eventually lead to
very substantial gains in classification efficiency.


EDGAR  M.  JOHNSON 
Technical Director


IMPROVING CLASSIFICATION EFFICIENCY
BY  RESTRUCTURING  ARMY  JOB  FAMILIES 

SUMMARY 


A.  Introduction 

The  present  selection  and  assignment  model  sampling  experiment  is  a  basic  research 
effort  that  contributes  to  the  practical  body  of  knowledge  essential  to  both  a  formulation  of 
personnel  job  matching  policies  based  on  scientific  principles  and  the  design  of  research  to 
provide  effective  techniques  and  tools  for  the  implementation  of  these  policies.  The  findings  of 
this  experiment  are  organized  and  interpreted  in  the  context  of  differential  assignment  theory 
(DAT). 

Our  knowledge  of  DAT  derives  from  the  results  provided  by  psychometric  theory, 
modeling,  and  simulations  of  personnel  selection  and  classification  processes  across  a  broad  area 
of  topics  that  includes  specifying  and  evaluating  (1)  personnel  measures  for  inclusion  in 
experimental  and  operational  batteries;  (2)  selection  and  assignment  variables  such  as  aptitude 
areas  (AAs);  (3)  selection  and  assignment  strategies  and  algorithms;  and  (4)  sets  of  job  families 
corresponding to the assignment variables. This study focuses on the last of these, and more
specifically on the gains in mean predicted performance (MPP) obtainable from a reconstitution of Army
jobs into more numerous and more classification-efficient sets of job families for use in the
classification process. As a result of the findings of this study, DAT is extended and refined,
and the immediate operational implications for the Army classification system become evident.

In  this  summary,  we  emphasize  the  practical  findings  derived  from  the  model  sampling 
experiment  described  more  completely  in  the  body  of  the  report.  As  noted,  the  results  of  this 
study  have  immediate  implications  for  policy  makers.  When  these  results  are  considered  in  the 
broader  context  of  DAT,  they  point  the  way  for  immediately  effecting  major  improvements  in 
the  personnel  classification  system  and  a  longer  range  redesign  of  the  personnel  classification 
system to maximize classification efficiency. This complete redesign should not be finalized
until further validity data become available. Expected results have the potential of improving
the Army’s annual productivity by an amount that would cost the Army hundreds of millions of
dollars each year if achieved by using alternative approaches such as recruiting a greater
proportion of high-quality personnel or using longer and/or more intense training programs.

B.  Operational  Issues 

For  more  than  a  decade,  there  have  been  a  number  of  advocates  calling  for  the  reduction 
of  the  number  of  job  families  used  by  the  Army  in  its  classification  system.  These  advocates 
frequently  pointed  out  that  there  are  no  more  than  four  strong  content  clusters  (i.e.,  group 
factors)  in  the  test  content  of  the  ASVAB  and  that  four  job  families  corresponding  to  the  Air 
Force’s  four  job  groupings  would  adequately  reflect  ASVAB  content.  Such  an  argument,  of 
course,  requires  the  equating  of  predictor  dimensionality  with  the  number  of  job  families  to 
which  these  predictors  can  be  used  to  make  reliable  assignments.  Proving  this  argument  to  be 
fallacious  is  a  major  objective  of  this  study. 

We  argue  that  mean  predicted  performance  (MPP)  is  the  figure  of  merit  most  appropriate 
for  comparing  the  benefits  obtainable  from  the  implementation  of  alternative  system  designs  and 
operational  strategies  for  selecting  and  assigning  personnel.  Unfortunately,  many  investigators 
prefer  to  use  predictive  validity  as  the  measure  of  classification  efficiency.  They  define 
classification  efficiency  in  terms  of  the  effect  that  proposed  changes  have  on  the  validities  of 
assignment  variables  for  performance  in  jobs  within  their  associated  job  families. 

Investigators who rely on predictive validity as the measure of classification efficiency
are  typically  quite  pessimistic  about  the  value  or  utility  of  personnel  classification.  They  appear 
to  be  greatly  influenced  by  the  degree  of  unidimensionality  in  the  predictor  space  and  the 
undeniably  dominant  contribution  that  the  largest  principal  component  (PC)  factor  makes  to  both 
the  predictor  intercorrelations  and  validities.  Thus,  they  assert  that  the  dominance  of  the  first 
(largest)  PC  factor  prevents  the  realization  of  significant  classification  effects.  These  advocates 
also  are  typically  impressed  with  the  lack  of  stability  in  regression  weights  when  used  in 
independent  samples.  Much  of  this  pessimism  results  directly  from  the  use  of  predictive  validity 
as the measure of classification efficiency.

The present research uses MPP as the measure of classification efficiency and permits the
effects of both dimensionality and instability of regression weights to appropriately affect
measures of classification efficiency. The results remain entirely free of the effects of all
sampling error and biases in one experiment (Design A) and are essentially free of all biases that might
affect the comparisons of the primary conditions in the second experiment (Design B). A
cross-validation design is used in both.

Factors relevant to the design of an optimal classification component of personnel systems
are investigated in this study. These factors can be summarized as follows:

1. Number of job families and corresponding assignment variables (e.g., the Air Force
has 4 composites or assignment variables, the Army has 9, the Navy has 11, and the
Marines have 5). DAT recommends as many job families as the available validity data
can support with stable weights for the assignment variables (AVs).

2.  Alternative  methods  for  forming  job  families. 

3.  Alternative  methods  for  constituting  AVs. 

4. The effect of using a more economical criterion variable (e.g., use of the Skill
Qualification Test, SQT, to determine job family structure and its use as the
dependent variable for computing "best" weights for the formulation of assignment
variables).

5.  Size  and  heterogeneity  of  the  test  battery  from  which  the  assignment  variables  are 
formed. 

6. Size of analysis samples required to form assignment variables (e.g., by computing
"best" weights for the tests in a test composite).

In  the  present  study,  emphasis  is  on  the  first  two  of  the  six  factors  outlined  above.  The 
remaining four factors are introduced, in a less complete fashion, to provide a contextual basis
for determining practical interactions with the two primary factors. Mean predicted performance
(MPP) is used to compare levels within and across these factors.

C. Research Approach

It  is  possible  to  conduct  a  study  of  this  type  either  by  drawing  samples  from  a  large  data 
bank  of  empirical  test  scores  or  by  generating  sets  of  synthetic  scores  which  have  the  statistical 
properties  of  empirical  scores,  including  their  expected  intercorrelations,  validities,  means, 
variances,  and  shape  of  their  score  distributions.  In  either  case,  the  assignment  and  classification 
processes  associated  with  each  alternative  policy  being  investigated  must  be  simulated  and  mean 
predicted  performance  computed  at  the  conclusion  of  each  simulation. 

We chose to conduct the present study by generating sets (vectors) of synthetic scores
based separately on the data from multiple jobs (18 and 60, respectively) provided by two major
Project A empirical studies. Each empirical data set is corrected for restriction in range due to
selection effects, and the criterion variables are corrected for unreliability. The corrected predictor
covariances and validities are then used to represent the two separate designated populations
from which synthetic scores are drawn. The Design A experiment uses Project A concurrent
study data, which provide the covariances of 29 predictors and their validities for 18 MOS.
These  18  empirical  samples  provide  the  parameters  to  define  the  designated  population  for 
Design  A. 
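As an illustration of the two corrections named above, the following minimal sketch applies the classical univariate formulas: the correction for attenuation due to criterion unreliability and the Thorndike Case II correction for direct restriction in range. The summary does not state which formulas were actually used (a multivariate correction is plausible for a full battery), so the function names, the order of application, and the numeric inputs are assumptions chosen only for illustration.

```python
import math

def correct_for_range_restriction(r_restricted, sd_unrestricted, sd_restricted):
    """Thorndike Case II: estimate the unrestricted validity from the
    validity observed in a sample selected directly on the predictor."""
    u = sd_unrestricted / sd_restricted
    return (r_restricted * u) / math.sqrt(1.0 + r_restricted ** 2 * (u ** 2 - 1.0))

def correct_for_attenuation(validity, criterion_reliability):
    """Estimate the validity against a perfectly reliable criterion."""
    return validity / math.sqrt(criterion_reliability)

# Hypothetical inputs: observed validity .30, predictor SD ratio 1.5,
# criterion reliability .80 (none of these values come from the report).
r_observed = 0.30
r_unrestricted = correct_for_range_restriction(r_observed, 1.5, 1.0)
r_corrected = correct_for_attenuation(r_unrestricted, 0.80)
```

Both corrections raise the estimated validity, which is why the designated populations show larger validities than the raw empirical samples.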

Covariances  among  ASVAB  tests  and  validities  of  these  tests  against  SQT  scores  for  60 
MOS  were  selected  from  a  Project  A  data  bank,  corrected  for  restriction  in  range  and 
attenuation,  and  used  to  compute  the  parameters  to  define  the  designated  population  for  Design 
B.  Both  designated  populations  are  assumed  to  represent  the  same  youth  population. 

We  refer  to  the  simulation  of  personnel  system  processes  using  synthetic  scores  as  model 
sampling.  Our  use  of  model  sampling  has  several  major  advantages  over  the  use  of  empirical 
scores  to  conduct  system  simulations.  For  example,  model  sampling  permits  the  generation  of 
as  many  independent  samples  as  desired  from  the  population  from  which  recruiting  and  selection 
is  accomplished,  and  thus  allows  the  use  of  a  research  design  that  controls  or  measures  the 
effects  of  different  sources  of  sampling  error  or  bias. 
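The model sampling idea can be sketched as follows: a designated population is represented by a covariance matrix, and any number of independent samples of synthetic score vectors are drawn from it. This sketch assumes multivariate normal scores generated through a Cholesky factor; the three-test correlation matrix, the sample size of 2,000, and all function names are hypothetical and are not taken from the report.

```python
import math
import random

def cholesky(cov):
    """Return the lower-triangular factor L such that L times its transpose
    reproduces cov (cov must be symmetric and positive definite)."""
    n = len(cov)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L

def draw_sample(cov, n_entities, rng):
    """Draw n_entities synthetic score vectors whose population covariance is cov."""
    L = cholesky(cov)
    dim = len(cov)
    sample = []
    for _ in range(n_entities):
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        sample.append([sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(dim)])
    return sample

def sample_correlation(sample, i, j):
    """Pearson correlation between variables i and j in one sample."""
    n = len(sample)
    mi = sum(row[i] for row in sample) / n
    mj = sum(row[j] for row in sample) / n
    cij = sum((row[i] - mi) * (row[j] - mj) for row in sample)
    cii = sum((row[i] - mi) ** 2 for row in sample)
    cjj = sum((row[j] - mj) ** 2 for row in sample)
    return cij / math.sqrt(cii * cjj)

# Hypothetical population correlations among three standardized tests.
POP_COV = [
    [1.0, 0.6, 0.4],
    [0.6, 1.0, 0.5],
    [0.4, 0.5, 1.0],
]

rng = random.Random(12345)
# Twenty independent cross samples, mirroring the count used in Design A.
cross_samples = [draw_sample(POP_COV, 2000, rng) for _ in range(20)]
```

Because every sample comes from the same known population, differences among samples reflect sampling error alone, which is what allows a design to control or measure that error.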

In  Design  A,  the  designated  population  is  used  to  generate:  (1)  an  analysis  sample  with 
the  same  number  of  entities  in  each  MOS  as  is  present  in  the  empirical  data  set  used  to  define 
the  designated  population,  and  (2)  20  independent  cross  samples  for  use  in  the  simulations.  Each 
cross  sample  is  used  separately  for  each  condition  in  a  repeated  measures  design.  The  analysis 
sample  is  used  in  applying  the  empirical  job  clustering  method  to  form  job  families  and  in 
computing  the  "best"  weights  to  be  applied  to  cross  sample  scores  to  form  predicted  performance 
measures  (FLS  composites)  for  use  as  AVs  in  the  simulations.  Weights  from  the  designated 
population  are  applied  to  cross  sample  scores  at  the  completion  of  each  simulation  to  obtain  the 
MPP  standard  scores  used  as  measures  of  classification  efficiency. 

The  designated  population  of  Design  B  serves  as  the  source  of  weights  for  both 
assignment  and  evaluation  variables.  The  assignment  variables  represent  predicted  performance 
within  job  families  while  the  evaluation  variables  are  the  predicted  performance  measures 
separately computed for each job. After optimal assignment to a job family, each entity is
randomly assigned to a job within that family and the entity’s predicted performance score is
computed. While scores for all evaluation variables are computed using weights computed on
independent samples, avoiding traditional back sample inflation, the less well known effect of
correlated error across assignment and evaluation variables was not eliminated in Design B as
it was in Design A. Since psychometricians lack experience with the effects of this kind of bias,
we  avoid  making  the  kind  of  experimental  comparisons  in  Design  B  which  would  be  most 
affected  by  its  presence.  We  do  not,  for  example,  contrast  the  classification  efficiency  of  a 
priori  and  empirically  determined  weights  for  the  test  composites  making  up  the  assignment 
variables  of  Design  B. 

D. Major Findings

1. Design A

It is unfortunate that in this experiment, for Design A, we have the best criterion
variables but only 18 jobs. This severely limits what we can determine
in this experiment and is the reason why we also provided for Design B, where 60 MOS could
be utilized. However, some of the most important conclusions of this study are drawn from
Design A, where we have both the more credible criterion variables and a more complete control
of correlated error and biases.

In each simulation of both Design A and Design B, we first reject 25 percent of the
entities  of  each  sample  on  the  basis  of  their  AFQT  scores.  We  then  optimally  assign  the  entities 
to  job  families.  All  MPP  standard  scores  reported  in  this  summary  give  the  expected  MPP 
standard  score  after  the  results  of  selection  were  subtracted  from  the  total  MPP  standard  score 
obtained  as  a  result  of  simulating  both  selection  and  optimal  assignment.  For  our  baseline 
condition  in  Design  A,  we  distribute  the  18  jobs  into  the  current  9  operational  job  families  and 
use  the  existing  aptitude  area  (AA)  scores  as  the  assignment  variables. 
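The two-step simulation described above (selection on AFQT, then optimal assignment, with the selection effect subtracted from the reported MPP) can be illustrated with a toy example. The sketch below is an assumption-laden miniature: the eight entities, their scores, the equal quotas, and the exhaustive search are all hypothetical. The report does not name its assignment algorithm, and a simulation of realistic size would need a linear programming or transportation-problem solver instead of exhaustive search.

```python
from itertools import permutations

def select_top(entities, keep_fraction):
    """Keep the given fraction of entities with the highest AFQT scores."""
    ranked = sorted(entities, key=lambda e: e["afqt"], reverse=True)
    return ranked[: int(round(len(ranked) * keep_fraction))]

def optimal_assignment(selected, quotas):
    """Exhaustively search for the quota-respecting assignment that maximizes
    total predicted performance; return the plan and its MPP."""
    slots = [family for family, q in enumerate(quotas) for _ in range(q)]
    best_total, best_plan = float("-inf"), None
    for plan in set(permutations(slots)):
        total = sum(e["pp"][f] for e, f in zip(selected, plan))
        if total > best_total:
            best_total, best_plan = total, plan
    return best_plan, best_total / len(selected)

def random_assignment_mpp(selected, quotas):
    """Expected MPP when selected entities fill the same quotas at random."""
    n = len(selected)
    return sum(
        sum(q * e["pp"][f] for f, q in enumerate(quotas)) / n for e in selected
    ) / n

# Hypothetical entities: an AFQT score plus predicted performance (pp)
# in each of three job families, all in standard-score units.
ENTITIES = [
    {"afqt": 1.2, "pp": [0.9, 0.4, 0.5]},
    {"afqt": 0.8, "pp": [0.3, 0.8, 0.2]},
    {"afqt": 0.5, "pp": [0.1, 0.2, 0.7]},
    {"afqt": 0.3, "pp": [0.6, 0.1, 0.0]},
    {"afqt": 0.1, "pp": [-0.1, 0.5, 0.1]},
    {"afqt": -0.2, "pp": [0.0, -0.2, 0.4]},
    {"afqt": -0.9, "pp": [-0.5, -0.6, -0.4]},
    {"afqt": -1.4, "pp": [-0.8, -0.7, -0.9]},
]
QUOTAS = [2, 2, 2]

selected = select_top(ENTITIES, 0.75)            # reject the lowest 25 percent
plan, total_mpp = optimal_assignment(selected, QUOTAS)
selection_mpp = random_assignment_mpp(selected, QUOTAS)
classification_mpp = total_mpp - selection_mpp   # gain due to assignment alone
```

The last line mirrors the convention used in this summary: the MPP attributed to classification is what remains after the MPP attributable to selection is subtracted from the total.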

Making selection and assignment decisions by chance yields an MPP standard score equal
to zero. Selecting the 75 percent of entities with the highest AFQT scores provides an
expected MPP of .225 for Design A under the hypothetical condition of random assignment to
jobs.  Using  the  operational  AAs  and  job  families  in  conjunction  with  an  optimal  assignment 
algorithm  adds  only  .092  to  the  MPP  standard  score.  As  noted  above,  we  use  this  condition  in 
Design  A  as  our  baseline  against  which  to  examine  the  gains  obtainable  from  adding 
improvements  by  stages;  the  percentage  improvement  over  both  the  baseline  condition  and  the 
previous  stage  is  given  at  each  stage. 

In  stage  one,  we  substitute  9  least  square  weighted  composites  based  on  the  full  ASVAB 
(FLS-ASVAB  composites)  for  the  9  operational  aptitude  area  composites.  This  yields  an  MPP 
attributable  to  classification  effects  of  .214,  an  increase  over  baseline  of  133  percent. 
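A least squares weighted composite of the kind substituted in stage one can be sketched from a predictor correlation matrix and a vector of test validities. The sketch assumes standardized tests and criterion, so that the weights solve the normal equations and the composite validity is the square root of the weighted sum of validities. The two-test correlation matrix, the validities, and the function names are hypothetical illustrations, not values from the report.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fls_weights(test_corr, validities):
    """Least squares weights for predicting a standardized criterion."""
    return solve(test_corr, validities)

def composite_validity(weights, validities):
    """Multiple correlation of the least squares composite with the criterion."""
    return math.sqrt(sum(w * v for w, v in zip(weights, validities)))

def unit_weight_validity(test_corr, validities):
    """Validity of an equally weighted composite, for comparison."""
    return sum(validities) / math.sqrt(sum(sum(row) for row in test_corr))

# Hypothetical two-test battery and validities for one job family.
TEST_CORR = [[1.0, 0.5], [0.5, 1.0]]
VALIDITIES = [0.4, 0.3]

weights = fls_weights(TEST_CORR, VALIDITIES)
R_fls = composite_validity(weights, VALIDITIES)
R_unit = unit_weight_validity(TEST_CORR, VALIDITIES)
```

In the population, the least squares composite cannot have lower validity than any other linear composite of the same tests; whether that advantage survives in independent samples is what the cross-validation design addresses.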

In  stage  two,  we  substitute  the  9  classification-efficient  job  families  for  the  9  operational 
job  families  while  using  the  corresponding  FLS-ASVAB  composites  as  assignment  variables. 
This  provides  an  MPP  that  is  greater  than  that  provided  by  selection  (MPP  =  .245),  a 
percentage  increase  over  baseline  of  166  percent,  and  a  gain  over  stage  one  of  14.5  percent. 

For  stage  three,  we  increase  the  job  families  from  9  to  12  while  still  using  corresponding 
FLS-ASVAB  composites  as  assignment  variables.  This  change  provides  an  MPP  due  to 
classification  of  .277,  an  increase  of  201  percent  over  baseline  and  13  percent  over  stage  two. 

Stage four involves the substitution of the 29 Project A concurrent validation
experimental variables for the 9 ASVAB tests in the computation of the corresponding FLS
composites, providing a measure of the upper limit of the gain in MPP obtainable from the
optimal use of the Project A experimental predictors to expand the dimensionality of the
operational classification battery. The use of these FLS-experimental composites for making
optimal assignments to the 12 classification-efficient job families provides an MPP due to
classification  of  .367,  an  increase  of  299  percent  over  baseline  and  32.5  percent  over  the  MPP 
obtained  in  stage  three. 

Reducing the number of job families from 9 to 6 lowers MPP to
.191 when the FLS-ASVAB composites are used as AVs; this is a 22 percent reduction when compared to
the stage two results. A reduction of 24 percent results if FLS-experimental composites are used
instead of FLS-ASVAB composites in a parallel comparison of assignment to 6
classification-efficient job families as compared to the use of 9 classification-efficient job families for this
purpose.
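The percentages quoted for Design A follow directly from the reported MPP values, as the short check below shows. The MPP figures are those stated in this summary; the variable and function names are illustrative only.

```python
# MPP due to classification, as reported for Design A.
BASELINE = 0.092  # operational AAs with the 9 operational job families
STAGES = [
    ("stage one", 0.214),    # FLS-ASVAB composites, 9 operational families
    ("stage two", 0.245),    # 9 classification-efficient families
    ("stage three", 0.277),  # 12 classification-efficient families
    ("stage four", 0.367),   # FLS composites from the 29 experimental variables
]
SIX_FAMILIES = 0.191         # FLS-ASVAB composites with 6 families

def pct_change(new, old):
    """Percentage change from old to new."""
    return 100.0 * (new - old) / old

def staged_gains(baseline, stages):
    """Return (name, gain over baseline, gain over previous stage) for each stage."""
    rows, previous = [], baseline
    for name, mpp in stages:
        rows.append((name, pct_change(mpp, baseline), pct_change(mpp, previous)))
        previous = mpp
    return rows

gains = staged_gains(BASELINE, STAGES)
six_family_change = pct_change(SIX_FAMILIES, STAGES[1][1])  # versus stage two
```

Rounded, the gains over baseline are 133, 166, 201, and 299 percent, and the reduction from 9 to 6 families is 22 percent, matching the figures in the text.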

2.  Design  B 

The  60  MOS  for  which  Skill  Qualification  Test  (SQT)  scores  are  available  permit  the 
clustering  of  jobs  into  three  sets  of  a  priori  job  families  as  follows:  (1)  the  9  operational  job 
families  used  by  the  Army  for  initial  classification  and  assignment;  (2)  23  of  the  Army’s  35 
career  management  fields  (CMFs);  (3)  an  intermediate  set  of  16  families  based  on  a  compromise 
between  the  two  sets  of  a  priori  clustering  concepts.  An  empirical  classification-efficient 
clustering  algorithm  was  used  to  provide  parallel  sets  of  9,  16  and  23  job  families.  MPP  is 
computed  after  all  of  the  entities  are  optimally  assigned  to  a  job  family  within  one  of  the  six  sets 
of  job  families.  The  FLS-ASVAB  composites  are  used  as  assignment  variables  for  making 
optimal  assignments  to  job  families  within  each  of  the  six  sets. 

The  Design  B  baseline  is  provided  by  FLS-ASVAB  composites  using  the  60  jobs  formed 
into the 9 operational job families. This results in an MPP standard score of .135. The use of
16  a  priori  job  families  results  in  an  MPP  of  .258,  a  91  percent  improvement.  An  increase  to 
23  CMF  job  families  results  in  an  MPP  of  .297,  an  improvement  of  120  percent.  Similarly, 
increasing  the  number  of  empirically  determined  classification-efficient  job  families  from  9  to 
16  improves  MPP  by  24.4  percent,  and  an  increase  from  9  to  23  job  families  in  the  classification 
system  provides  an  improvement  of  40.6  percent.  The  above  results,  plus  those  obtained  from 
an increase from 16 to 23 job families, are provided in Table S-1.

Table S-1

COMPARISON OF DIFFERENCES AND PERCENTAGE GAINS IN
MPP USING SQT AS THE CRITERION FOR 60 JOBS

Number of Job Families        Empirical               Operational
Increase from:            Difference   % Gain     Difference   % Gain

9 to 16                      .065       24.4         .123       91.1
16 to 23                     .043       13.0         .039       15.1
9 to 23                      .108       40.6         .162      120.0

The substitution of the empirically determined job families for the a priori job families
increases MPP by 97 percent when there are 9 job families, 28 percent when there are 16 job
families, and 26 percent when there are 23 job families. The total gain in MPP achieved from
changing the structure of the 60 MOS from 9 operational job families to a classification-efficient
set of 23 job families is 177 percent, of which 120 percent is immediately
obtainable from the increase in number of job families and the additional 57 percent can then
be obtained from also using the improved method of structuring jobs. If the first change is in
the method of forming job families, the first gain is 97 percent and the second gain, from also
increasing job families from 9 to 23, is 80 percent.
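The decomposition of the 177 percent gain can be checked arithmetically. In the sketch below, the a priori MPP values and the empirical differences are those reported for Design B; the 9-family empirical MPP is not reported directly and is derived here from the stated 97 percent gain, so it should be read as an inferred value. All names are illustrative.

```python
# Reported MPP for the a priori job family sets (9 operational, 16 intermediate, 23 CMF).
A_PRIORI = {9: 0.135, 16: 0.258, 23: 0.297}

# Derived, not reported: 9-family empirical MPP implied by the 97 percent gain.
EMPIRICAL_9 = A_PRIORI[9] * 1.97
# Reported differences for the empirical families (.065 and .108).
EMPIRICAL = {9: EMPIRICAL_9, 16: EMPIRICAL_9 + 0.065, 23: EMPIRICAL_9 + 0.108}

BASE = A_PRIORI[9]  # Design B baseline: 9 operational job families

def pct_of_base(delta):
    """Express an MPP difference as a percentage of the Design B baseline."""
    return 100.0 * delta / BASE

total_gain = pct_of_base(EMPIRICAL[23] - BASE)

# Path 1: increase the number of families first, then change the clustering method.
number_first = pct_of_base(A_PRIORI[23] - BASE)
then_method = pct_of_base(EMPIRICAL[23] - A_PRIORI[23])

# Path 2: change the clustering method first, then increase the number of families.
method_first = pct_of_base(EMPIRICAL[9] - BASE)
then_number = pct_of_base(EMPIRICAL[23] - EMPIRICAL[9])
```

Both paths sum to the same total (120 + 57 and 97 + 80) because every component is expressed as a percentage of the same baseline MPP of .135 and not as a percentage of the preceding stage.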

E.  Conclusions  and  Recommendations 

1.  Theoretical  Implications 

The  findings  of  this  study  strongly  support  a  number  of  DAT  principles  including: 

a.  The  largest  immediate  improvement  that  can  be  provided  for  any  personnel 
classification  system  is  the  use  as  assignment  variables  of  least  square  estimates  of 
performance  based  on  all  variables  in  the  operational  test  battery,  that  is,  the  adoption 
of  full  least  square  (FLS)  composites  as  replacements  for  the  present  type  of 
aptitude  area  composites. 

b. The optimal number of job families for inclusion in an FLS-composite-based
personnel classification system is as many families as can be coupled with adequately
valid assignment variables. The factor limiting the number of job families is the
availability of validity data for the constituent jobs in the job families. For example,
although there are approximately 260 entry-level Army jobs, the Project A database used
for this study would not be able to provide even minimal validity data for more
than about 40 job families.

c.  Whenever  it  is  not  feasible  to  provide  separate  FLS  composites  for  each  job,  it  is 
essential  that  jobs  be  clustered  into  job  families  in  a  manner  that  maximizes 
classification  efficiency. 

d.  The  expansion  of  the  dimensionality  of  the  classification  battery  by  the  inclusion  of 
more  predictors  with  greater  heterogeneity  can  be  expected  to  increase  the  potential 
classification  efficiency  to  about  the  same  extent  as  can  be  accomplished  by  the  use  of 
more classification-efficient job families in place of the existing a priori job families.

The principles given above are very strongly supported by the results of this study. Some
investigators  have  suggested  contradictory  classification  system  guidelines  based  on  erroneously 
equating  classification  efficiency  to  predictive  validity.  But  when  measurement  of  classification 
efficiency  is  made  in  terms  of  MPP,  computed  after  entities  have  been  optimally  assigned  to 
jobs,  as  in  this  study,  DAT  principles  have  been  consistently  validated. 

2.  Operational  Implications 

Design  A  provides  further  evidence  that  the  operational  AA  test  composites  are  grossly 
inadequate. At the same time, the data strongly suggest that the present ASVAB tests have
sufficient multidimensionality and differential validity to permit effective personnel classification.
In the present study, we see that assignment variables derived from the ASVAB (of the type
recommended by DAT) yield a 133 percent improvement over the operational AVs. Adding
all 20 of the Project A concurrent validation experimental variables to the 9 existing ASVAB
tests to form a new, much larger classification battery provides a further gain in MPP of
32.5 percent.

While  the  procedures  used  to  form  the  existing  operational  job  families  are  clearly  not 
optimal,  they  are  much  more  effective  than  are  the  AA  composites  corresponding  to  each  family. 
Most  of  the  potential  increase  in  MPP  obtainable  from  using  more  job  families  is  available  from 
the  use  of  a  priori  job  families  that  meet  other  operational  needs. 

The  primary  technical  report  on  Project  A  (McLaughlin,  Rossmeissl,  Wise,  Brandt,  and 
Wang,  1984)  concludes  that  job  clustering  processes  in  the  context  of  the  same  validity  data  as 
used  for  our  Design  B  lacked  sufficient  stability  to  warrant  confidence  that  any  gains  provided 
would  be  demonstrable  in  independent  samples.  However,  the  emphasis  in  McLaughlin,  et  al. 
(1984)  was  on  the  instability  of  the  regression  weights  for  FLS  composites,  rather  than  on  the 
MPP  achievable  from  the  optimal  assignment  of  entities  in  independent  samples  to  alternative 
job  families.  DAT  favors  the  latter  utility  approach  over  the  use  of  psychometric  indices,  as 
favored  by  McLaughlin  et  al.  (1984),  that  have  no  apparent  connection  to  utility. 

Operational  job  families  should  be  based  on  the  use  of  all  available  information  and  must 
provide  for  all  MOS.  This  study  has  not  attempted  to  make  maximum  use  of  even  the  two  data 
sets  selected  for  use  with  Design  A  and  Design  B,  let  alone  make  use  of  all  of  the  validity 
information available to the Army. Thus, while we believe our findings to be based on
adequately representative data that permit credible conclusions regarding the utility of using
more and better defined job families in the Army initial classification system, we do not
recommend the installation of the specific job families identified in this study. We instead
recommend the integration, through expert judgment, of the information from (1) our
classification-efficient (CE) job clustering procedure as used on Design A, Design B, and
additional data sets; (2) CMF membership of each MOS; and (3) classification family
membership of each MOS. The use
of  such  an  integrated  approach  would  readily  provide  20  to  30  credibly  classification-efficient 
job  families  for  use  in  a  revised  classification  system. 


IMPROVING  CLASSIFICATION  EFFICIENCY  BY 
RESTRUCTURING  ARMY  JOB  FAMILIES 

I.  INTRODUCTION 


A.  Objectives 

The  purpose  of  this  research  is  to  build  upon  the  foundation  of  differential  assignment 
theory  by  examining  the  effects  of  restructuring  Army  job  families  on  potential  classification 
efficiency (PCE). Specifically, this research addresses the effects on PCE of (1) increasing
the number of job families; (2) employing different job clustering methods to form job families;
(3) using full least squares (FLS) composites instead of aptitude area composites for assignment;
(4) substituting different criterion measures in the joint predictor-criterion space; (5) increasing
the dimensionality of the predictor space; and (6) computing the regression weights of FLS
composites on moderately sized analysis samples (as contrasted to the infinitely large analysis
samples used in the studies of Nord and Schmitz, 1989, 1991, and Whetzel, 1991).

Focusing  on  the  job  family  structure  is  a  promising  approach  to  improving  classification 
efficiency.  In  this  research,  a  new  job  clustering  method  is  proposed  that  minimizes  the 
successive  reduction  in  potential  classification  efficiency  in  the  resulting  job  families.  The  goal 
is  to  provide  a  job  family  clustering  method  that  contributes  to  an  improvement  in  the  ability  to 
classify  individuals  efficiently  and,  thus,  an  increase  in  overall  mean  predicted  performance 
(MPP). 

B.  Theoretical  Background 

The  earliest  and  most  significant  contributions  to  classification  research  come  from  the 
psychometric  theories  of  Hubert  Brogden  and  Paul  Horst  during  the  1940s  and  1950s.  Their 
work  provides  the  theoretical  foundation  for  all  subsequent  research  on  classification.  Building 
upon  the  work  of  these  early  researchers,  Zeidner  and  Johnson  (Johnson  &  Zeidner,  1990, 1991; 
Zeidner,  1987;  Zeidner  &  Johnson,  1989a,  1989b,  1991a,  1991b)  introduced  differential 
assignment  theory  (DAT)  as  part  of  a  revival  of  classification  research  within  the  field  of 
personnel psychology. The following section briefly reviews the early work of
Brogden and Horst as it relates to the present research and
discusses some of the key principles in differential assignment theory relevant to
this research.

1. Linking Classification Efficiency to Performance: Brogden’s Allocation Model

Hubert  Brogden  is  responsible  for  directly  tying  measurement  of  classification  efficiency 
to  mean  predicted  performance  (MPP)  and  thus  to  the  utility  of  classification.  Brogden  (1946, 
1949; see also Brogden & Taylor, 1950) is probably best known for his models estimating
the  utility  of  selection  devices.  The  utility  of  a  selection  device  is  the  degree  to  which  its  use 
improves  the  quality  of  the  individuals  selected  beyond  what  would  have  occurred  had  that 
device  not  been  used  (Blum  &  Naylor,  1968).  Brogden  (1949)  used  the  principles  of  linear 
regression  to  demonstrate  how  the  selection  ratio  and  the  standard  deviation  of  job  performance 
in  dollars  affect  the  economic  utility  of  a  selection  device. 

Brogden’s concentration on the utility of selection devices led naturally to the expression
of  classification  in  the  same  terms.  In  1959,  Brogden  developed  a  general  allocation  model  in 
which  he  examined  the  efficiency  of  classification  as  a  function  of  the  validity  of  the  estimates 
of  job  performance,  the  degree  of  intercorrelation  of  these  estimates,  and  the  number  of  jobs. 
His  goal  was  to  show  the  effects  of  these  variables  on  productivity  when  classifying  individuals 
to jobs. He demonstrated that $\mathrm{MPP} = R(1-r)^{1/2} f(m)$. In this formula, R is the average
predictive  validity  of  the  least  squares  estimates  (LSEs)  of  job  performance,  r  is  the  average 
intercorrelation  among  the  LSEs  of  job  performance,  and  f(m)  is  an  order  function  which  reflects 
the  effects  of  increasing  the  number  of  jobs  (m)  or  job  families  on  classification  efficiency. 
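
As a quick numerical illustration of the role of r in this formula, the sketch below holds R and f(m) fixed and evaluates the factor (1-r)^{1/2} relative to its value at r = 0. The function name `relative_gain` and the particular values of r are illustrative assumptions, not part of Brogden's presentation.

```python
import math


def relative_gain(r: float) -> float:
    """MPP at intercorrelation r divided by MPP at r = 0, with R and f(m) held fixed.

    Under MPP = R * (1 - r)**0.5 * f(m), this ratio reduces to (1 - r)**0.5.
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must lie in [0, 1]")
    return math.sqrt(1.0 - r)


# Illustrative values of the average intercorrelation among the LSEs.
for r in (0.0, 0.50, 0.80, 0.95):
    print(f"r = {r:.2f}: {relative_gain(r):.1%} of the gain available at r = 0")
```

At r = .80 the factor is about .45, and at r = .95 it is still about .22, so the term shrinks slowly until r is very close to 1.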

From  this  formulation,  it  is  apparent  that  when  R  and  f(m)  increase,  MPP  also  increases. 
However,  note  that  the  lower  the  intercorrelation  among  the  LSEs,  r,  the  greater  the  MPP.  In 
practice,  it  is  not  unusual  to  have  fairly  high  intercorrelations  among  the  LSEs.  The  significance 
of  Brogden’s  finding  is  that  even  when  the  intercorrelations  among  the  estimates  are  high, 
considerable  classification  efficiency  remains.  As  Brogden  points  out,  even  with 
intercorrelations  of  .80,  classification  gains  are  45%  as  great  as  with  intercorrelations  of  zero. 
Nord and Schmitz (1989, 1991) found in their empirical study that even with an average r of .95
among  the  predicted  performance  LSEs,  they  were  able  to  obtain  considerably  greater  MPP 
when  LSEs  were  used  for  assignment  compared  to  when  the  U.S.  Army’s  operational  composites 
were used for assignment.

The significance of Brogden’s formulation to the present research is that the predictive
validity, R, and the average intercorrelation among the LSEs, r, are both affected by increasing
the number of job families. Increasing the number of job families in a classification-efficient
manner affects validity, R, because it results in more homogeneous jobs being placed together
to be predicted by a single LSE. With more homogeneous job families, more precise
classification of individuals into those families is possible. This more reliable and precise
prediction capability results in an increase in validity (R). Increasing the number of job families
in a classification-efficient manner affects the intercorrelation, r, among the LSEs because it
results in a greater uniqueness in the job families. Thus, it is possible to capitalize on the
differences among the job families, resulting in a decrease in the average intercorrelation among
the LSEs.

However,  Brogden  (1959)  also  demonstrated,  through  the  order  function  f(m),  that  even 
if  R  and  r  are  held  constant,  increasing  the  number  of  jobs  will  increase  classification  efficiency. 
This  effect  is  analogous  to  the  effect  that  the  selection  ratio  has  on  the  selection  process.  For 
the selection ratio, as the number of applicants increases or the number of available vacancies
decreases, more selectivity into these vacancies is possible, resulting in an increase in predicted
performance. Similarly, as the number of jobs or job families increases, it is possible to more
precisely  assign  individuals  to  the  jobs  or  job  families  by  capitalizing  on  intra-individual 
differences.  This  greater  precision  in  assignment  would  also  result  in  an  increase  in  predicted 
performance. 
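
The effect of the number of jobs with R and r held constant can be illustrated with a small Monte Carlo sketch. It rests on assumptions that go beyond the text: each LSE is built from one common standard normal component and one unique standard normal component, and each person simply takes the job with the highest LSE, with no quotas and no rejection. Under those simplifications the expected maximum of m independent standard normal deviates plays the role of f(m). The function name `simulate_mpp` and the values R = .6 and r = .8 are hypothetical.

```python
import math
import random


def simulate_mpp(m: int, R: float, r: float, n_people: int = 20_000, seed: int = 0) -> float:
    """Mean predicted performance when each person takes the job with the highest LSE.

    Each LSE is R*sqrt(r)*g + R*sqrt(1-r)*u_j, with g and u_j independent standard
    normal, so every LSE has variance R**2 and every pair of LSEs correlates r.
    Assumption: no job quotas and no rejection (a simplification of Brogden's setup).
    """
    rng = random.Random(seed)
    a = R * math.sqrt(r)        # loading on the common component
    b = R * math.sqrt(1.0 - r)  # loading on each unique component
    total = 0.0
    for _ in range(n_people):
        g = rng.gauss(0.0, 1.0)
        total += max(a * g + b * rng.gauss(0.0, 1.0) for _ in range(m))
    return total / n_people


# Hold R and r constant and vary only the number of jobs m.
for m in (1, 2, 3, 5, 10):
    print(m, round(simulate_mpp(m, R=0.6, r=0.8), 3))
```

For m = 2 the expected maximum of two independent standard normal deviates is 1/sqrt(pi), about .56, so the simulated value with R = .6 and r = .8 should land near .15. The printed MPP rises as m increases even though neither R nor r changes.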

Brogden  (1959)  made  a  number  of  simplifying  assumptions  in  order  to  mathematically 
demonstrate  the  relationships  just  discussed.  The  present  research  provides  a  more  realistic, 
empirical  test  of  these  relationships.  As  the  number  of  job  families  increases,  validity  should 
increase  and  the  intercorrelation  among  the  LSEs  should  decrease.  These  effects  should  be 
manifested  by  an  increase  in  MPP  after  optimal  assignment  to  jobs. 

2. Horst’s Differential Validity Index

Paul  Horst  (1954)  is  the  primary  contributor  to  the  theory  and  methodology  underlying 
the  design  of  classification-efficient  test  batteries.  The  most  classification-efficient  test  battery 
is  one  with  the  greatest  differential  validity.  Differential  validity  represents  the  ability  of  a  test 
to forecast differences in performance in different jobs (Cascio, 1991). The concept of
differential validity can be illustrated through a two-job classification problem.
For  two  jobs,  A  and  B,  one  test  would  be  selected  for  inclusion  in  a  classification-efficient 
battery  that  had  a  high  correlation  with  performance  on  job  A  and  a  low  (or  preferably  negative) 
correlation  with  performance  on  job  B.  Then,  another  test  would  be  selected  that  had  a  high 
correlation  with  performance  on  job  B  and  not  on  job  A.  The  resulting  battery  would  be  one 
with  high  differential  validity.  The  goal  is  to  be  able  to  predict  an  individual’s  relative  fitness 
for  job  A  over  job  B  or  vice  versa. 
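
The two-job example can be made concrete with a short sketch. It uses a standard result from correlation algebra rather than anything stated by Horst (1954): for a standardized test and standardized criteria, the correlation of the test with the difference in performance between jobs A and B is (v_a - v_b) / sqrt(2(1 - c_ab)). The names `v_a`, `v_b`, and `c_ab` and all numerical values are hypothetical.

```python
import math


def differential_validity(v_a: float, v_b: float, c_ab: float) -> float:
    """Correlation of a standardized test with the difference in standardized
    performance between job A and job B.

    v_a, v_b: the test's validity for job A and for job B.
    c_ab: correlation between performance on the two jobs (must be below 1).
    """
    return (v_a - v_b) / math.sqrt(2.0 * (1.0 - c_ab))


# Hypothetical validities chosen only for illustration.
tests = {
    "test favoring job A": (0.50, 0.10),
    "test favoring job B": (0.10, 0.50),
    "test equally valid for both": (0.50, 0.50),
}
for name, (v_a, v_b) in tests.items():
    print(f"{name}: {differential_validity(v_a, v_b, c_ab=0.30):+.3f}")
```

A test that is equally valid for both jobs has a differential validity of zero no matter how high its predictive validity is, which is why a classification-efficient battery needs tests whose validities differ across jobs.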

Thus,  in  order  to  develop  classification-efficient  test  batteries,  Horst  (1954)  needed  to 
first  define  an  index  of  differential  validity  to  be  used  for  much  more  complex,  realistic  test 
development. Horst’s differential index, $H_d$, can most generally be stated as the sum of the
squared  correlations  between  the  difference  of  each  pair  of  criterion  scores  and  the 
corresponding  pair  of  differences  between  the  best  weighted  predictors  of  each  criterion.  Note 
that  in  order  to  compute  a  difference  between  each  pair  of  criterion  measures  and  the  best 
predictor  of  each  difference,  it  is  necessary  to  have  criterion  measures  for  each  person  on  each 
job.  Since  this  is  never  possible  in  actual  practice,  Horst  (1954)  stipulates  that  predicted  criteria 
based  on  the  "least-square"  estimates  from  the  test  battery  be  substituted  for  the  unobtainable 
actual  criterion  measures.  This  theorem  is  a  key  assumption  underlying  classification  research 
since  without  it  evaluating  the  efficiency  of  various  classification  batteries  and  classification 
procedures  would  not  be  possible. 

Brogden  (1955)  provided  a  rigorous  proof  of  this  theorem  showing  that,  for  any 
assignment  to  jobs,  the  sum  of  the  multiple  regression  criterion  estimates  will  equal  the  sum  of 
the  actual  criterion  scores.  This  theorem  holds  because  the  actual  criterion  components  that  are 
orthogonal  to  the  joint  predictor-criterion  space  are  totally  irrelevant  to  either  the  implementation 
of  a  selection/classification  process,  or  to  the  measurement  of  process  efficiency.  The  only 
criterion  components  that  are  relevant  are  within  the  joint  predictor-criterion  space,  and  the 
correlation  of  predicted  performance  with  actual  performance  is  unity  when  computed  in  the  joint 
predictor-criterion space. When both the predictors and the predicted criteria are the least
squares estimates (LSEs), Horst’s index simplifies to the average squared difference between each
pair of criterion measures.

Horst’s differential index, $H_d$, plays a key role in the present research because it is $H_d$
that is maximized in the classification-efficient job clustering algorithm developed for this
research. Thus, although $H_d$ is typically used for selecting the most classification-efficient tests
for a test battery, the present research is designed to demonstrate that $H_d$ can also be used in
forming classification-efficient job families.

The purpose of forming classification-efficient job families with the use of $H_d$ is to
provide an increase in MPP. Horst (1954) was simply defining a psychometric index and
provided no link to the measurement of MPP. However, it has been demonstrated that Horst’s
differential  index  can  be  directly  linked  to  MPP,  and  thus  to  utility,  through  its  relationship  to 
Brogden’s  measure  of  classification  efficiency  (Johnson  &  Zeidner,  1990,  1991). 

Brogden’s  1959  model  is  based  on  a  set  of  assumptions  regarding  the  relationships  among 
and  across  predictor  and  criterion  variables  (Johnson  &  Zeidner,  1990,  1991).  These 
relationships can be depicted in terms of Spearman’s Two Factor theory. Brogden’s assumptions
are met if: (a) the factor matrix, $F_v$, is a matrix such that $F_v F_v'$ is equal to $C$ (the covariances
among predicted performance scores), (b) all elements of the first general factor (the g factor)
from $F_v$ are equal to the product $R r^{1/2}$, and (c) the remaining factors (specific unique factors)
from $F_v$ can be expressed as a diagonal matrix with the diagonal elements equal to $R(1-r)^{1/2}$. It
is possible to show a link between Brogden’s model and Horst’s differential validity index
because Horst’s $H_d$ is equal to the sum of the squared deviations from the column means of each
element of $F_v$. The sum of squared deviations for the first column of $F_v$ (the g factor) is equal
to zero, and the sum of the squared deviations for the remaining m columns of $F_v$ (the unique
factors) is R(1-r). Thus, $H_d$ is equal to (m-1) times R(1-r) when Brogden’s assumptions are met.
Brogden’s complete formula for mean predicted performance is: $\mathrm{MPP} = R(1-r)^{1/2} f(m)$.
Therefore, when substituting Horst’s index it is only necessary to take the square root of $H_d$,
divide by (m-1), and multiply by f(m) to obtain MPP when Brogden’s assumptions are met.
Thus, it is reasonable to expect, to the extent that Brogden’s model is robust with respect to his
assumptions, that $H_d$ closely approximates MPP. Even though we know Brogden’s assumptions
are rarely met in empirical data, we can still expect that the utilization of a clustering method
that increases $H_d$ will also increase MPP. Similarly, other trends, such as an increase in the
number of job families, that result in an increase in $H_d$ can be expected to provide a similar
increase in MPP.

3.  Concepts  and  Principles  of  Differential  Assignment  Theory 

Differential  assignment  theory  (DAT)  can  be  defined  by  four  organizing  concepts:  (1) 
to  maximize  benefits,  a  set  of  quantitative  principles  must  be  employed  that  embrace  selection 
of  predictors  in  a  battery,  the  structure  of  job  families,  and  the  strategies  and  algorithms  used 
in  the  selection/assignment  process;  (2)  utility  models,  measuring  benefits  in  terms  of  mean 
predicted  performance,  provide  the  best  approach  for  specifying  personnel  selection  policies  and 
procedures  for  operational  systems;  (3)  benefits  for  both  selection  and  classification  procedures 
are  maximized  by  using  the  same  weights  for  a  given  set  of  composites  under  optimal 
conditions,  while  under  non-optimal  conditions,  selection  and  classification  must  be  separately 
considered;  and  (4)  any  multidimensional  selection/classification  strategy  and  algorithm  can  be 
practically  implemented  in  operations  by  utilizing  available  computer  capabilities. 

Several  of  the  key  principles  of  DAT  are  directly  relevant  to  the  current  research.  At 
the  core  of  DAT  is  the  principle  of  multidimensionality  in  the  joint  predictor-criterion  (JP-C) 
space. It is this principle that serves as the theoretical foundation of DAT with regard to the
nature  of  human  abilities.  DAT  assumes  a  non-trivial  degree  of  multidimensionality  in  the  joint 
predictor-criterion  space.  This  means  that  differential  assignment  theory  assumes  there  are  other 
factors  besides  the  "g"  factor  (general  cognitive  ability)  that  can  play  a  significant  role  in  the 
selection  and  classification  process.  This  assertion  is  counter  to  the  consensus  established  in 
recent  decades  that  a  general  cognitive  ability  component  is  sufficient  for  predicting  job 
performance  in  all  jobs  (see  the  Special  Issue  of  the  Journal  of  Vocational  Behavior,  1986,  for 
a  collection  of  opinions). 

Indeed,  since  the  advent  of  the  type  of  validity  generalization  (VG)  research  introduced 
by  Frank  Schmidt  and  John  Hunter  in  the  1970s  (Schmidt  &  Hunter,  1977),  there  has  been 
increasing  support  among  measurement  specialists  for  the  sole  use  of  g  for  predicting  job 
performance.  Current  VG  theory,  as  contrasted  with  Mosier’s  (1951)  earlier  concept,  is  founded 
on  the  principle  that  the  g  factor  has  an  overriding  influence  on  performance,  and  it  is  this 
common  element  among  jobs  that  enables  validity  to  be  generalizable  across  different  jobs  and 
situations.

DAT is enriched by broadly based VG concepts and findings. However, the VG
emphasis on the g factor, or on g plus one or two additional group factors, prevalent among
strong proponents of VG theory, is not a requisite characteristic of DAT. While the theory is
not restricted to any particular factor structure, the assumption of a non-trivial degree of
multidimensionality in the JP-C space is essential.

Recent  research  has  shown  that  contrary  to  the  belief  of  many  VG  theorists,  it  is  possible 
to  demonstrate  a  non-trivial  degree  of  multidimensionality  in  the  JP-C  space.  Whetzel  (1991) 
factored  the  predictor-criterion  covariances  of  the  same  U.S.  Army  Project  A  concurrent 
validation  database  used  in  the  present  research.  The  matrix  of  predictor-criterion  covariances 
in this database was factored and rotated such that Horst’s differential index was maximized
in  each  successive  factor  (Zeidner  &  Johnson,  1989b,  1991b).  This  factoring  was  done  in  order 
to  identify  the  most  classification-efficient  factors  in  the  joint  predictor-criterion  space  and  to 
identify  representative  jobs  that  loaded  differentially  on  these  factors.  Whetzel  (1991)  found  that 
the  first  factor,  the  g  factor,  accounted  for  79  percent  of  the  variance.  However,  with  the  first 
factor  removed  it  was  determined  that  six  factors  contained  jobs  that  loaded  highly  and 
differentially  on  these  factors  and,  therefore,  yielded  a  classification-efficient  solution.  These 
results  meant  that  there  were  six  non-trivial  dimensions,  besides  the  g  factor,  within  the  joint 
predictor-criterion  space.  For  the  present  research,  it  is  possible  to  examine  the  effects  of 
changing  the  dimensionality  of  the  JP-C  space  in  two  different  ways.  One  way  is  to  compare 
assignment using the standard Armed Services Vocational Aptitude Battery (ASVAB) with
assignment based on the ASVAB augmented by 20 new experimental predictors. The
experimental predictors should expand the dimensionality of the JP-C space, thereby providing for more
efficient  classification.  Another  way  of  expanding  the  joint  predictor-criterion  space  that  will 
be  used  in  the  present  research  is  to  increase  the  number  of  job  families  in  a  classification- 
efficient  manner.  As  the  number  of  job  families  is  increased,  each  job  family  will  become  more 
homogeneous  within  itself  (more  unique  components  and  less  g).  Each  job  family  will  also 
become more heterogeneous with respect to other job families if the job families are formed by
taking differential validity into account. In other words, the idea is to expand the joint space
by forming job families that are maximally different from one another.

Another of the key DAT principles states that the "best" selection and/or assignment
variable for maximizing either selection or classification efficiency is a full least squares (FLS)
regression composite. Note that this is a "full" composite, meaning that all of the tests in the
battery are to be included. A common misconception is that selective elimination of composites
in a battery to reduce the intercorrelations of test composites is helpful or even necessary to
increase classification efficiency. A set of FLS composites cannot be improved with respect to
classification efficiency by the elimination of tests that measure only g, or of any other tests that
might reduce the intercorrelation of test composites (Zeidner & Johnson, 1989b, 1991b).

There  have  been  two  empirical  studies  that  have  examined  the  potential  of  an  FLS 
composite for maximizing classification efficiency. Sorenson (1965) used simulation techniques
to compare allocation to jobs based on full regression equations using all tests of the Army
Classification Battery with allocation based on two-test aptitude area composites. Sorenson
(1965)  found  that  the  gain  in  MPP  over  random  assignment  more  than  doubled  by  substituting 
full  regression  equations  for  the  aptitude  areas.  Nord  and  Schmitz  (1989,  1991)  simulated  the 
assignment  of  individuals  to  jobs  in  a  very  similar  way  to  that  used  in  the  present  research. 
However, they used FLS composites with regression weights based on the nine aptitude area
composites,  rather  than  directly  on  the  ASVAB  test  scores.  Nord  and  Schmitz  (1989,  1991) 
found  gains  in  MPP  of  over  72%  by  using  FLS  assignment  instead  of  the  current  U.S.  Army 
aptitude  area  composites.  In  the  present  research,  a  condition  has  been  built  into  the  design  that 
allows  for  another  comparison  of  assignment  with  FLS  composites  instead  of  the  current  U.S. 
Army aptitude area composites. However, unlike in Nord and Schmitz (1989, 1991), the FLS
equation is based directly on the ASVAB test scores, which should provide for even greater
expected gains in MPP.

Finally,  the  most  relevant  differential  assignment  principle  for  the  present  research  is  the 
principle  which  states  that,  in  general,  increasing  the  number  of  assignment  composites  and 
associated  job  families  adds  to  potential  classification  efficiency.  It  is  important  to  realize  that 
the  magnitude  of  a  gain  in  potential  classification  resulting  from  an  increase  in  the  number  of 
job  families  will  depend  on  the  method  used  to  provide  more  job  families  and  upon  the 
heterogeneity  of  the  jobs  in  the  joint  predictor-criterion  space.  One  of  the  best  ways  of 
restructuring jobs in order to increase potential classification efficiency should be to reconstitute
a total set of jobs into classification-efficient clusters. It is this last principle that is most directly
examined in the present research through the development of a classification-efficient method
of job clustering and use of a set of conditions to demonstrate the expected increase in MPP as
the number of job families increases.

C.  Alternative  Approaches  for  Improving  PCE 

There  are  many  alternative  approaches  that  could  be  employed  to  bring  about 
improvements  in  the  selection  and  classification  system  of  an  organization  such  as  the  U.S. 
Army.  By  far  the  largest  improvements  in  personnel  classification  efficiency  would  come  from 
the  substitution  of  FLS  composites  for  the  existing  U.S.  Army  aptitude  area  composites. 
However,  there  are  a  number  of  other  promising  changes  that  could  provide  appreciable  amounts 
of  improvement  in  productivity  that,  for  the  most  part,  are  additive  to  the  gains  due  to  the  use 
of  FLS  composites.  Improvements  could  result  from  the  creation  and  use  of:  a  classification- 
efficient  (CE)  test  battery,  more  and  better  job  families,  better  CE  test  composites  as  assignment 
variables,  and  more  effective  assignment  strategies. 

1.  Changing  Test  Battery  Content 

A  set  of  test  composites  can  provide  no  more  PCE  for  a  prescribed  set  of  job  families 
than  was  provided  in  the  test  selection  process  that  created  the  operational  test  battery.  If  it  is 
possible  to  change  the  content  of  the  operational  test  battery,  improvements  in  PCE  could  be 
accomplished  by  selecting  predictors  that  experts  believe  have  a  high  degree  of  differential 
validity  (as  contrasted  with  predictive  validity)  for  inclusion  in  an  experimental  test  pool.  It 
would  then  be  possible  to  perform  test  selection  employing  indices  that  measure  PCE  to  create 
an  operational  battery  with  the  best  PCE. 

Recently,  Johnson,  Zeidner,  and  Scholarios  (1990)  completed  a  study  that  compared 
various  test  selection  indices  in  terms  of  their  potential  for  maximizing  PCE.  From  an 
experimental  test  pool  of  29  tests  (including  the  9  ASVAB  tests),  tests  were  selected  to  create 
FLS  composites  of  five  or  ten  tests.  These  test  batteries  were  then  used  in  the  simulated 
assignment  of  individuals  to  jobs  and  MPP  was  calculated  to  assess  the  efficiency  of  that 
assignment. Two of the indices used to select tests were Horst’s differential index, $H_d$, and
Max-PSE, which is a measure of selection efficiency. Use of the classification-efficient index,
$H_d$, resulted in gains in MPP as great as 22% over the use of the selection-efficient index,
Max-PSE. Additionally, this study showed gains in MPP of approximately 25% when the number
of FLS predictors was increased from five to ten. Overall, it was concluded that
classification-efficient methods of test selection lead to greater MPP in an assigned group than
a selection-efficient method.

2.  Restructuring  Jobs  into  New  Job  Families 


If  an  operational  test  battery  were  fixed  and  could  not  be  readily  changed,  PCE  could  still 
be  improved  by  efficiently  increasing  the  number  of  job  families  with  their  associated  predictor 
composites.  It  is  estimated  that  an  increase  in  the  number  of  composites  and  associated  job 
families  to  somewhere  between  20  and  40  would  most  likely  provide  the  maximum  efficiency 
for Army jobs. In the present research, classification-efficient job families created using
$H_d$ will be compared with job families formed using a selection-efficient method. These
empirical  methods  of  forming  job  families  will  also  be  compared  to  the  job  family  structures 
currently  used  by  the  U.S.  Army. 

3.  Changing  Assignment  Variables 

The  most  important  change  in  assignment  variables  that  could  be  adopted  by  the  Army 
would  be  the  conversion  of  the  existing  aptitude  area  test  composites  into  least  squares  estimates 
based  on  all  tests  in  the  classification  battery,  i.e.,  using  predicted  performance  as  the  basis  of 
assignment  rather  than  test  composites.  These  full  least  squares  (FLS)  composites  are  optimal 
for  both  selection  and  classification  of  personnel. 

The  use  of  numerous  test  composites  would  require  the  Army  to  record  many  scores  on 
each  soldier’s  official  record.  One  way  to  use  many  assignment  composites  would  be  to  install 
a  two-tiered  system  in  which  the  large  number  of  FLS  composites  are  used  to  make 
recommendations  regarding  assignment,  while  a  much  smaller  number  of  factor  scores  are  used 
for  counseling.  These  factor  scores  would  also  be  used  as  a  basis  for  setting  minimum  cutting 
scores  for  entry  into  special  training  programs,  as  a  career  planning  aid  to  be  available  to  the 
soldier,  and  for  other  personnel  management  purposes,  such  as  retention  and  promotion.  A 
study is currently underway to assess the amount of PCE that can be provided by a small
number of factor scores. This study is designed to compare the PCE provided by a number of
different types of assignment composites.

4.  Changing  Selection-Assignment  Strategies 

Improvements  in  PCE  could  be  made  through  the  consideration  of  different 
selection/assignment  strategies.  One  simple  selection/assignment  method  would  be  a  two-stage 
strategy  in  which  applicants  are  selected  based  on  a  single  predictor  and  then  assigned  to  specific 
jobs using multiple assignment variables. However, a possibly more efficient
selection/assignment strategy for practical implementation would be a simultaneous selection and
optimal  classification  system  called  the  multidimensional  screening  (MDS)  procedure  (Johnson 
&  Zeidner,  1990,  1991). 

The  MDS  procedure  is  best  understood  in  the  context  of  Brogden’s  (1959)  model  where 
each  predictor  is  an  FLS  composite  yielding  a  score  that  divides  into  a  general  (g)  and  a  unique 
(u)  component.  Brogden  (1959)  discussed  an  assignment  strategy  in  which  applicants  are 
simultaneously  selected  and  classified  into  jobs  using  only  the  unique  components,  and  he  states 
that  "removal  of  the  common  component  will  be  shown  to  have  no  effect  on  the  classification 
of [individuals] or on the allocation average" (p. 184). MDS is a modification of Brogden’s
model to reflect a simultaneous strategy in which selection and classification are accomplished
using  a  separate  FLS  composite  for  each  job  that  incorporates  both  the  g  and  u  components. 
This  strategy  is  an  important  improvement  in  Brogden’s  model  because  it  allows  for  a  larger 
gain  in  mean  predicted  performance  due  to  selection  when  g  constitutes  a  large  part  of  each 
score  (as  is  usually  the  case). 
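
Brogden's remark that removing the common component leaves the classification unchanged can be checked on a toy problem. The sketch below is an illustration under stated assumptions, not the MDS procedure itself: it enumerates every one-person-per-job assignment for five hypothetical people and five jobs, and the loadings 0.9 and 0.4 are arbitrary.

```python
import itertools
import random


def best_assignment(scores):
    """Brute-force the one-person-per-job assignment with the largest total score.

    scores[i][j] is person i's predicted performance in job j.
    Returns a tuple p in which person i is assigned to job p[i].
    """
    n = len(scores)
    return max(
        itertools.permutations(range(n)),
        key=lambda p: sum(scores[i][p[i]] for i in range(n)),
    )


rng = random.Random(1)
n = 5  # five people and five jobs: small enough to enumerate all 120 assignments

# One common (g) component per person and one unique (u) component per person and job.
g = [rng.gauss(0.0, 1.0) for _ in range(n)]
u = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)]

full = [[0.9 * g[i] + 0.4 * u[i][j] for j in range(n)] for i in range(n)]
unique_only = [[0.4 * u[i][j] for j in range(n)] for i in range(n)]

print(best_assignment(full) == best_assignment(unique_only))
```

When everyone is assigned, each person's common component adds the same amount to every candidate assignment, so it cannot change which assignment is best. It matters only once some applicants are rejected, which is the case the simultaneous strategy addresses.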

Whetzel  (1991)  completed  a  simulation  study  that  compared  three  methods  of 
selection/assignment:  selection  on  g  and  then  assignment  on  the  FLS  composite  (two-stage 
strategy);  selection  and  assignment  based  only  on  g;  and  simultaneously  selecting  and  assigning 
on  FLS  (multidimensional  screening).  Whetzel  (1991)  found  that  MDS  was  far  superior  in  terms 
of gain in MPP compared to selecting and assigning solely on g. MDS also yielded statistically
significantly greater gains in MPP than selection on g and assigning on FLS, but the gains were
more  modest.  It  was  concluded  from  this  study  that  the  largest  and  most  dramatic  increase  in 
MPP comes from the use of FLS composites in a two-stage selection/classification process. A
smaller, but still worthwhile, improvement results from the integration of selection and
classification procedures using the MDS algorithm.

5.  Changing  Criterion  Variables  and  Increasing  Validity  Information 

Finally,  all  approaches  relating  to  a  redesign  of  the  classification  system  could  be  made 
more effective, providing greater classification efficiency, if: (1) the criterion variables were
more  reliable  and  more  accurate  measures  of  the  value  of  the  individual  in  the  accomplishment 
of  the  mission;  and  (2)  the  analysis  samples  on  which  validity  data  are  computed  were  larger. 
Our  knowledge  of  the  effect  of  sample  size  on  the  stability  of  regression  weights  is  extensive. 
However,  this  knowledge  does  not  translate  to  predicting  the  effect  of  analysis  sample  size  on 
MPP  after  optimal  assignment  of  a  pool  of  candidates  to  jobs.  While  we  do  not  in  this  study 
directly  measure  the  impact  of  criterion  quality  and  the  size  of  analysis  samples  on  MPP,  further 
insight  on  this  issue  can  be  obtained  from  this  study. 

D.  Current  Trends 

1.  Validity  Generalization 

In  recent  decades,  there  has  been  a  steady  decline  in  research  and  application  pertaining 
to  classification.  The  most  popular  trend  in  personnel  research  in  recent  decades  has  been  the 
validity  generalization  movement  (Schmidt  &  Hunter,  1977).  The  research  that  has  come  out 
of  VG  has  led  to  the  conclusion  that  there  is  an  all-pervasive  general  cognitive  ability  (g) 
component  that  is  the  best  measure  for  predicting  job  performance.  Although  general  cognitive 
ability  contributes  substantially  to  efficient  selection,  it  leaves  little  room  for  classification  and 
has  led  to  a  general  pessimism  on  the  part  of  many  researchers  about  the  future  usefulness  of 
classification  batteries.  This  pessimism  is  unfounded,  however,  and  is  due  mainly  to 
misunderstandings  about  classification.  Differential  assignment  theory  has  been  introduced  to 
dispel  some  of  these  misunderstandings  and  to  demonstrate  the  tremendously  important  role  that 
classification  can  play  in  the  overall  utility  of  a  complete  personnel  utilization  system. 

There  is  a  general  resistance  to  DAT  mainly  because  there  is  a  tendency  to  confuse  it 
with  what  is  often  called  either  "specific  aptitude  theory"  or  "differential  aptitude  theory"  (see 
Schmidt, Hunter, & Larson, 1988). Specific aptitude theory had its origins in the work of
researchers such as Guilford (1956, 1957, 1959), Hull (1928), and Thurstone (1938). The idea
behind  specific  aptitude  theory  is  that  there  are  certain  aptitudes  that  should  be  relevant  for 
predicting  performance  on  certain  jobs.  Thus,  a  math  test  should  predict  work  requiring 
numerical  skills  while  a  verbal  test  should  predict  work  requiring  verbal  skills.  Under  ideal 
circumstances,  each  test  would  measure  a  separate  aptitude,  thus  mandating  low  intercorrelations 
among  the  tests.  A  composite  of  these  tests  could  be  constructed  through  the  use  of  multiple 
regression  in  order  to  predict  success  in  the  job  or  job  family  for  which  it  was  constructed.  In 
1928,  Clark  Hull  published  a  book  on  aptitude  testing  in  which  he  stated  his  differential  aptitude 
hypothesis.  This  hypothesis  asserts  that  a  tailored  composite  of  specific  tests  could  make  an 
incremental  contribution  to  the  prediction  of  performance  over  and  above  the  contribution  of 
general  cognitive  ability.  Through  the  use  of  factor  analysis,  a  great  deal  of  research  was  done 
to  identify  specific  aptitudes  that  represented  the  structure  of  human  abilities.  These  specific 
abilities  were  needed  in  order  to  build  tailored  aptitude  test  batteries  consistent  with  specific 
aptitude  theory.  Thurstone’s  (1938)  studies  resulted  in  the  identification  of  seven  factors  which 
he  termed  the  "primary  mental  abilities".  Guilford  (1956,  1957,  1959)  presented  a  scheme  to 
classify  known  factors  of  intelligent  behavior  that  resulted  in  a  theoretical  representation  of  the 
structure  of  the  intellect  composed  of  120  different  factors. 

DAT  is  different  from  specific  aptitude  theory  in  two  major  ways.  First,  in  constructing 
a  classification  test  battery,  emphasis  is  placed  on  accentuating  the  differences  between  predicted 
measures  of  success.  Horst’s  (1954)  differential  validity  index  facilitates  the  selection  of 
predictors  for  inclusion  in  such  a  classification-efficient  test  battery.  The  goal  is  to  have  a  set 
of  predictors  that  capitalizes  on  any  and  all  inter-  and  intra-individual  ability  differences.  It  is 
not  necessary  for  each  predictor  to  represent  a  different  aptitude,  and  it  is  not  necessary  that  the 
predictors  have  low  intercorrelations.  Brogden  (1951,  1959)  demonstrated  that  high  predictor 
intercorrelations  do  not  reduce  classification  efficiency  as  much  as  previously  thought. 

The second major way that DAT differs from specific aptitude theory is that, for
differential assignment to be maximally efficient, full least squares (FLS) regression equations
should be used as the best estimate of actual criterion performance. This is contrary to specific
aptitude theory, which has been implemented through unit-weighted composites
consisting of fewer tests than are in the total battery. Allocation to jobs based
upon a full least squares regression equation for the entire battery provides for maximally
efficient assignment according to differential assignment theory.
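
The contrast between FLS weights and a unit-weighted composite of fewer tests can be sketched numerically. The example below is a minimal illustration and not the procedure used in this research: the intercorrelation matrix R_XX and the validity vector R_XY are hypothetical values for three standardized tests, and the function names are assumptions introduced here. FLS weights b solve R_XX b = R_XY, and the validity of any weighted composite w is w'R_XY / sqrt(w'R_XX w).

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def composite_validity(weights, r_xx, r_xy):
    """Correlation of a weighted composite of standardized tests with the criterion."""
    numerator = sum(w * v for w, v in zip(weights, r_xy))
    variance = sum(wi * wj * r_xx[i][j]
                   for i, wi in enumerate(weights)
                   for j, wj in enumerate(weights))
    return numerator / math.sqrt(variance)

# Hypothetical test intercorrelations and validities (not values from this report).
R_XX = [[1.0, 0.6, 0.5],
        [0.6, 1.0, 0.4],
        [0.5, 0.4, 1.0]]
R_XY = [0.50, 0.30, 0.40]

fls_weights = solve(R_XX, R_XY)                       # full least squares weights
fls_validity = composite_validity(fls_weights, R_XX, R_XY)
# Unit-weighted composite that uses only the first two of the three tests.
unit_validity = composite_validity([1.0, 1.0, 0.0], R_XX, R_XY)
```

In the sample on which the weights are computed, no other weighting of the same tests can exceed the FLS validity, so fls_validity is at least as large as unit_validity here. How much of that advantage survives in independent samples is an empirical question that a cross-validation design is meant to answer.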

Another  misunderstanding  about  classification  research  is  the  belief  that  predictive 
validity  can  be  used  to  evaluate  the  effectiveness  of  classification.  Some  recent  articles 
discussing  classification  effects  in  terms  of  predictive  validity  include  Hunter  (1986),  Schmidt, 
Hunter,  and  Larson  (1988),  and  Thorndike  (1986).  A  more  appropriate  figure  of  merit  for 
evaluating  classification  effects  is  mean  predicted  performance  (MPP).  When  dealing  with  a 
simple  univariate  selection  model,  the  validity  coefficient  is  directly  proportional  to  MPP  when 
the  selection  ratio  is  held  constant  and  the  relatively  simple  optimal  selection  algorithm  is  used 
(i.e.,  the  rank  ordering  of  applicants  on  predicted  performance  and  selection  in  order  from  the 
top  down).  However,  when  dealing  with  a  more  complicated  multivariate  model  required  for 
classification  there  are  no  simple  analytical  methods  for  computing  MPP.  In  fact,  the  only 
practical  solution  is  to  use  real  or  synthetic  data  as  input  into  simulations  of  personnel  utilization 
strategies.  MPP  can  then  be  calculated  from  the  simulation  to  evaluate  potential  classification 
efficiency  (PCE)  of  various  personnel  assignment  strategies.  What  may  not  be  obvious  is  that 
predictive  validity  is  relegated  by  the  underlying  mathematics  to  what  in  many  cases  may  be  a 
minor  role  in  achieving  an  increase  in  classification  efficiency.  Under  certain  conditions,  one 
set  of  test  composites  having  a  smaller  average  predictive  validity  than  another  could  actually 
possess  greater  classification  efficiency  (Zeidner  &  Johnson,  1989b,  1991b). 
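
The univariate case mentioned above can be written out explicitly. The sketch below assumes a standardized, bivariate normal predictor and criterion and top-down selection at a fixed selection ratio. Under those assumptions the expected criterion score of the selected group equals the validity multiplied by the normal ordinate at the cutoff and divided by the selection ratio. The function name and the example values are hypothetical.

```python
from statistics import NormalDist

def mpp_top_down(validity, selection_ratio):
    """Mean predicted performance (in criterion z-score units) of those selected
    top-down, assuming a standardized bivariate normal predictor and criterion."""
    std_normal = NormalDist()
    cutoff = std_normal.inv_cdf(1.0 - selection_ratio)   # predictor cut score
    return validity * std_normal.pdf(cutoff) / selection_ratio

# With the selection ratio held constant, doubling the validity doubles MPP.
low_validity_mpp = mpp_top_down(0.30, 0.50)
high_validity_mpp = mpp_top_down(0.60, 0.50)
```

No comparably simple expression exists once several jobs, quotas, and correlated predicted performance scores enter the problem, which is why the classification case is evaluated by simulation.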

The  trend  in  recent  times  is  to  devote  all  attention  to  increasing  the  predictive  validity 
of  test  batteries  without  concern  for  differential  validity  needed  for  efficient  classification.  This 
trend  is  due  primarily  to  the  emphasis  that  VG  places  on  a  dominant  g  factor.  The  development 
of  aptitude  test  batteries  in  the  U.S.  Army  over  the  years  has  certainly  been  affected  by  this 
trend. One of the Army’s first sets of aptitude tests was the Army Classification Battery (ACB).
As  the  name  suggests,  there  was  considerable  emphasis  placed  on  the  ACB’s  ability  to  classify 
individuals  into  jobs  efficiently  during  the  first  fifteen  years  of  its  use  (Zeidner,  1987). 
Unpublished  Army  studies  show  a  generally  declining  trend  in  the  amount  of  classification 
efficiency  present  with  each  change  of  ACB  content  during  the  period  that  the  ACB  was  being 
transitioned into the current ASVAB. In addition, the use of unit-weighted aptitude area
composites further erodes the classification potential of the ASVAB. This trend has continued
with the experimental pool of predictors developed recently for the U.S. Army’s Project A.
These  predictors  were  assembled  with  the  goal  of  increasing  predictive  validity,  rather  than 
differential  validity  (McHenry,  Hough,  Toquam,  Hanson,  &  Ashworth,  1990). 

2. Decreasing the Number of Job Families

Another  disturbing  trend  in  recent  times  is  the  tendency  for  researchers  to  favor 
decreasing  the  number  of  job  families  in  operational  systems.  This  trend,  once  again,  is  caused 
primarily  by  a  focus  only  on  selection  efficiency  and  increasing  support  for  general  cognitive 
ability  as  sufficient  for  predicting  performance  in  all  jobs.  The  result  of  considering  the  g  factor 
as  the  only  significant  predictor  of  performance  in  all  jobs  is  that  the  differences  between  jobs 
are  diminished  and  the  need  for  numerous  job  families  decreases. 

The  Army  currently  has  nine  job  families,  the  Navy  has  11  families,  the  Air  Force  has 
four families, and the Marines have six. Other large organizations similarly have fairly small
numbers  of  job  families.  For  example,  the  Office  of  Personnel  Management  (OPM)  has  recently 
found  seven  job  families  to  be  representative  of  the  professional  and  administrative  jobs  in  the 
federal  government  (Rheinstein,  McCauley,  &  O’Leary,  1989b).  These  job  families  were  used 
in  the  development  of  the  new  Administrative  Careers  with  America  examination.  The 
Department  of  Labor,  based  on  the  research  of  Hunter  (1983),  is  using  five  job  families 
(clustered  by  job  complexity,  rather  than  by  job  similarity)  to  represent  all  12,000  jobs  in  the 
Dictionary  of  Occupational  Titles. 

Thus,  there  is  a  tendency  for  job  family  systems  developed  for  large  organizations  in 
recent  times  to  include,  on  the  average,  approximately  four  to  seven  job  families.  The  Army 
and  the  Navy  are  the  exceptions  in  that  they  are  still  using  9  and  11  job  families,  respectively. 
However, there have recently been serious suggestions that the Army decrease its number of
job  families  to  four  (McLaughlin,  Rossmeissl,  Wise,  Brandt,  &  Wang,  1984). 

The  Air  Force  currently  has  four  job  families  that  match  the  number  of  strong  group 
factors  in  the  ASVAB.  Some  believe  that  basing  the  number  of  job  families  on  the  number  of 
strong  factors  in  the  test  battery  is  the  most  appropriate  method.  One  of  the  main  purposes  of 
the present research is to demonstrate that it would be a serious mistake for the U.S. Army to
decrease its number of job families to four. Decreasing the number of Army job families
would result in a further erosion of any classification potential in the Army’s
selection/assignment system. This research is designed to demonstrate that actually increasing
the number of job families beyond the current nine could begin progress toward a more nearly optimal
selection/assignment system for the U.S. Army.

E.  Research  Approach 

This  research  utilizes  a  model  sampling  approach  combined  with  a  computer  simulation 
of  the  selection  and  classification  process.  Model  sampling  involves  the  generation  of  synthetic 
entities  that  have  specified  statistical  characteristics  in  common  with  empirical  random  samples 
drawn  from  the  empirical  sample.  In  this  approach,  actual  empirical  databases  with  covariances 
between  predictors  and  criteria  provide  the  parameter  values  that  define  the  designated 
population.  These  parameters  form  the  basis  for  the  generation  of  synthetic  entities  with  test 
scores  that  have  the  same  expected  means  and  covariances  as  the  designated  population.  The 
parameters  of  the  synthetic  samples  differ  from  those  of  the  designated  population  by  an  amount 
of  sampling  error  that  is  related  to  sample  size  as  though  they  were  empirical  samples.  This 
model  sampling  approach  has  many  advantages  over  simply  using  existing  empirical  database 
scores  in  the  simulations. 
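
The idea of model sampling can be sketched in a few lines. The example below is an illustration under stated assumptions, not the generator used in this research: the three-variable population matrix (two tests and one criterion) is hypothetical, scores are generated as multivariate normal through a Cholesky factor, and the sample size and number of replications are arbitrary choices. Each synthetic sample has the population covariances as expected values, and its observed statistics depart from them by ordinary sampling error.

```python
import math
import random

def cholesky(cov):
    """Return lower-triangular L with L L' = cov (cov must be positive definite)."""
    n = len(cov)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(cov[i][i] - s)
            else:
                low[i][j] = (cov[i][j] - s) / low[j][j]
    return low

def draw_sample(cov, n_entities, rng):
    """Generate synthetic entities whose scores have expected covariance cov."""
    low = cholesky(cov)
    n_vars = len(cov)
    sample = []
    for _ in range(n_entities):
        z = [rng.gauss(0.0, 1.0) for _ in range(n_vars)]
        sample.append([sum(low[i][k] * z[k] for k in range(i + 1))
                       for i in range(n_vars)])
    return sample

def correlation(sample, i, j):
    """Pearson correlation between variables i and j within one sample."""
    n = len(sample)
    mean_i = sum(row[i] for row in sample) / n
    mean_j = sum(row[j] for row in sample) / n
    sxy = sum((row[i] - mean_i) * (row[j] - mean_j) for row in sample)
    sxx = sum((row[i] - mean_i) ** 2 for row in sample)
    syy = sum((row[j] - mean_j) ** 2 for row in sample)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical designated population: test 1, test 2, criterion.
POPULATION = [[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.3],
              [0.5, 0.3, 1.0]]

rng = random.Random(1992)
# Independent synthetic samples; 20 replications of size 500 chosen for illustration.
validities = [correlation(draw_sample(POPULATION, 500, rng), 0, 2)
              for _ in range(20)]
mean_validity = sum(validities) / len(validities)
```

The sample validities of test 1 scatter around the population value of 0.5, and the scatter shrinks as the sample size grows, which is the sense in which the synthetic samples behave like empirical samples.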

Model  sampling  provides  increased  flexibility  in  that  samples  of  any  number  and  size  can 
be  generated  for  any  universe,  including  a  current  or  future  youth  population,  if  that  universe 
can  be  defined  by  both  the  covariances  among  the  relevant  predictor  variables  and  the  validities 
of  these  variables  against  all  criterion  components.  It  could  be  argued  that  the  shape  of  a  score 
distribution  would  be  more  realistic  for  a  simulation  using  empirical  scores  rather  than  synthetic 
scores generated to have a normal distribution. However, with a little extra effort, synthetic
scores can be generated to reflect any desired degree of censoring. This has the added
advantage that distributions can be produced that are closer to the distribution of a future
population than is the detailed shape of the distributions of past years. Finally,
model  sampling  allows  the  evaluation  of  conditions  which  could  affect  the  system  but  are  not 
available  in  terms  of  actual  empirical  data. 

For  the  present  research,  the  primary  advantage  of  the  model  sampling  approach  is  that 
a  cross-validation  design  can  be  utilized  that  provides  a  rigorous  methodological  investigation 
of the various experimental conditions in this experiment. For one set of conditions in this
research,  model  sampling  allows  the  generation  of  an  analysis  sample  independent  of  the 
designated  population  for  use  in  job  clustering  and  definition  of  assignment  variables.  For  all 
conditions,  the  model  sampling  approach  allows  the  generation  of  20  independent  cross-samples 
that  vary  in  size  depending  upon  the  demands  of  the  design.  Thus,  by  using  model  sampling 
techniques  it  is  possible  to  essentially  replicate  each  condition  in  this  experiment  20  times. 

The  model  sampling  technique  used  in  this  research  is  made  more  credible  by  the  realism 
of  the  designated  population  made  possible  by  two  empirical  databases  from  the  Army  Selection 
and  Classification  Project  (Project  A).  The  two  parts  of  this  research,  Design  A  and  Design  B, 
make  use  of  the  two  databases  differently.  Design  A  is  based  upon  the  18  jobs  investigated  in 
the  concurrent  validation  phase  of  Project  A  (Campbell,  1990).  Although  data  were  collected 
for only 18 jobs in this phase, validation data on 20 new experimental tests with carefully
designed performance criteria are available in this database.

For  Design  A,  the  same  18  jobs  also  were  extracted  from  the  second  database  from  the 
early  stages  of  Project  A  called,  for  the  purposes  of  this  research,  the  "McLaughlin"  database 
(McLaughlin,  Rossmeissl,  Wise,  Brandt,  &  Wang,  1984).  The  "McLaughlin"  database  contains 
validation  data  for  the  ASVAB  and  the  Skill  Qualification  Test  (SQT)  and  training  score  criteria. 
The  "McLaughlin"  database  is  utilized  in  Design  A  to  compare  the  use  of  the  less  appropriate 
SQT  criterion  (constructed  for  use  as  a  training  diagnostic  tool)  with  the  specially  developed 
Core  Technical  Proficiency  (CTP)  criterion  developed  in  the  concurrent  validation  phase  of 
Project  A. 

However,  the  "McLaughlin"  database  plays  an  even  more  important  role  for  Design  B. 
The  "McLaughlin"  database  contains  validation  data  for  over  98  jobs  of  which  60  jobs  were 
selected  for  this  research.  The  availability  of  60  jobs  with  validation  data  based  on  moderately 
large  sample  sizes  is  very  important  to  the  operational  implications  of  this  research.  It  makes 
it  possible  to  compare  much  more  substantial  and  realistic  sets  of  operational  and  empirical  job 
families  than  is  possible  with  the  more  limited  set  of  jobs  in  Design  A.  Both  databases  will  be 
described  in  more  detail  in  the  next  chapter. 

In  Design  A,  three  primary  areas  relating  to  job  structure  are  investigated.  First,  it  is 
expected  that  there  will  be  a  significant  improvement  in  MPP  as  the  number  of  job  families  to 
which individuals are classified increases. For Design A, the number of job families will
increase  from  6  to  9  to  12.  Second,  it  is  expected  that  there  will  be  significantly  greater 
improvement  in  MPP  with  jobs  empirically  clustered  into  job  families  specifically  to  maximize 
classification  efficiency  compared  with  jobs  empirically  clustered  into  job  families  specifically 
to  maximize  selection  efficiency.  Third,  the  empirical  methods  of  job  clustering  developed  here 
are  expected  to  result  in  significantly  greater  improvement  in  MPP  than  the  operational  job 
families  currently  being  used  by  the  U.S.  Army. 

A  number  of  secondary  areas  also  are  investigated  in  Design  A.  In  Design  A,  it  is 
possible  to  examine  whether  the  efficiency  of  classification  varies  with  the  number  of  job 
families  according  to  a  negatively  accelerated  function.  This  idea  was  originally  proposed  by 
Brogden  (1959).  It  is  expected  that  the  increase  in  MPP  from  6  to  9  job  families  will  be  greater 
than  the  increase  in  MPP  from  9  to  12  job  families.  In  addition,  Design  A  examines  the  effects 
on  classification  efficiency  of  expanding  the  number  of  predictors  from  nine  ASVAB  tests  to 
nine  ASVAB  tests  plus  20  experimental  predictors.  It  is  expected  that  the  expanded  predictor 
space  will  provide  an  improvement  in  MPP  compared  to  the  use  of  only  the  ASVAB.  Design 
A  also  contains  a  set  of  conditions  to  compare  the  use  of  FLS  composites  instead  of  aptitude  area 
composites  for  assignment.  It  is  expected  that  the  use  of  FLS  composites  will  result  in 
significantly  greater  improvements  in  MPP  than  the  current  aptitude  area  assignment  system. 
Finally,  in  Design  A,  the  effects  on  classification  efficiency  of  using  the  SQT  criterion  instead 
of  the  CTP  criterion  are  investigated.  It  is  expected  that  substituting  the  SQT  criterion  for  the 
CTP  criterion  will  result  in  no  differences  in  the  conclusions  reached  about  any  of  the  primary 
or  secondary  areas  just  discussed. 
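
The expectation of a negatively accelerated gain can be illustrated with a deliberately simplified simulation. The sketch below is not the simulation used in this research. It assumes that each person's predicted performance scores for the m job families are independent standard normal values, that everyone is assigned to the family with the highest predicted score, and that there are no job quotas. The function name, seed, and sample size are hypothetical choices.

```python
import random

def mean_best_of(m_families, n_people, rng):
    """Average of each person's highest predicted score across m job families."""
    total = 0.0
    for _ in range(n_people):
        total += max(rng.gauss(0.0, 1.0) for _ in range(m_families))
    return total / n_people

rng = random.Random(1991)
# Simplified MPP (in standard score units) for 6, 9, and 12 job families.
mpp = {m: mean_best_of(m, 20000, rng) for m in (6, 9, 12)}
gain_6_to_9 = mpp[9] - mpp[6]
gain_9_to_12 = mpp[12] - mpp[9]
```

Under these assumptions MPP keeps rising as families are added, but the gain from 6 to 9 families is larger than the gain from 9 to 12. Correlated predicted scores, unequal validities, and quotas would change the magnitudes, which is what the full simulations are designed to estimate.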

In  Design  B,  it  is  important  to  demonstrate  the  job  clustering  methods  with  a  large 
number of jobs. The other expectations in Design B are similar to those in Design A in that (1) the
magnitude of the MPP scores is expected to increase as the number of job families increases,
and (2) the empirical methods of forming job families will be compared to the operational job
families currently used by the Army. In Design B, however, the number of job families will
be increased from 9 to 16 to 23. This increase provides a further opportunity to examine
Brogden’s proposal that the efficiency of classification varies with the number of job families
according to a negatively accelerated function. It is hoped that Design B will support some conclusions
about the efficiency of utilizing as many as 23 different job families.

II. RESEARCH METHOD


A.  Data  Description  and  Corrections 

Two  empirical  databases  will  be  utilized  in  the  present  research.  Both  of  these  databases 
were  part  of  the  Army  Selection  and  Classification  Project  (Project  A).  The  first  data  set  used 
in  this  research  comes  from  the  Project  A  effort  to  generate  new  predictor  and  criterion 
measures  to  enhance  the  selection  and  classification  system  for  all  entry-level  positions  in  the 
United States Army. In a concurrent validation phase of Project A, the ASVAB along with new
predictor  and  criterion  measures  were  administered  to  incumbents  who  entered  the  Army  in  1983 
or  1984.  This  data  set  forms  the  first  database  used  in  the  present  research  and  will  be  called 
simply  the  "Project  A"  data  set. 

The  second  data  set  comes  from  the  early  stages  of  Project  A  which  concentrated  on 
validating  the  current  ASVAB  aptitude  area  composites  and  considering  alternative  composites 
of  the  ASVAB.  For  this  purpose,  available  computer  records  containing  ASVAB  predictor 
scores  along  with  criterion  measures  consisting  of  training  school  grades  and  the  Skill 
Qualification  Test  (SQT)  were  drawn  for  people  who  joined  the  Army  in  1981  and  1982.  The 
analyses of these data are reported in McLaughlin, Rossmeissl, Wise, Brandt, and Wang (1984).
These  data  form  the  second  database  used  in  the  present  research  and  will  be  called  the 
"McLaughlin"  data  set. 

1. Job Sample

The  Project  A  concurrent  validation  data  set  contained  validation  data  for  19  Military 
Occupational  Specialties  (MOS).  There  was  only  one  modification  made  to  this  database  for  the 
present study. One of the MOS, 51B Carpentry and Masonry Specialist, was not used in this
study because it had a very small sample size (n=69) compared to the other MOS in the
database,  and  it  resulted  in  an  unstable  factor  structure  when  its  use  was  attempted  in  previous 
research  (Whetzel,  1991).  The  "McLaughlin"  data  set  contained  validation  data  for  98  MOS. 
Of  these  98  MOS,  the  same  18  jobs  contained  in  the  Project  A  concurrent  validation  sample 
were selected for the first part of this study (Design A). An additional 42 MOS forming a total
set  of  60  jobs  were  selected  for  the  second  part  of  this  study  (Design  B).  The  total  number  of 
jobs  was  set  at  60  because  it  was  originally  estimated  that  the  largest  number  of  job  families  that 
would  be  created  would  be  approximately  30  job  families.  A  reasonable  number  of  jobs  was 
determined  to  be  at  least  twice  the  number  of  job  families. 

The  additional  42  jobs  selected  to  compose  the  "McLaughlin"  database  for  Design  B  were 
chosen  out  of  the  possible  98  available  MOS  in  a  two  stage  selection  process.  In  the  first  stage, 
all  jobs  with  a  sample  size  greater  than  200  were  selected.  Including  the  original  18  Project  A 
MOS,  this  process  identified  50  jobs.  Two  jobs  were  eliminated  because  they  did  not  have 
reliability  information.  Five  jobs  were  eliminated  because  they  shared  obvious  similarities  to 
jobs  already  included  in  the  database  (e.g.,  three  personnel  jobs  and  two  helicopter  repair  jobs). 
Thus,  at  the  end  of  the  first  stage  there  were  43  candidate  jobs.  In  the  second  stage,  the 
remaining  jobs  were  reviewed  as  candidates  to  complete  the  set  of  60  jobs.  The  following 
criteria  were  used  in  selection:  (1)  desire  to  include  jobs  that  were  in  as  many  of  the  different 
Career  Management  Fields  (CMF)  as  possible;  (2)  availability  of  reliability  data;  and  (3)  sample 
sizes  close  to  or  greater  than  100.  Using  these  criteria,  17  additional  jobs  were  selected  to 
complete  the  set  of  60  jobs. 

Appendix A (Table A-1) lists the 18 jobs contained in the Project A concurrent validation
data set along with the sample sizes for these 18 jobs for both the Project A data set and the
"McLaughlin" data set. Note that the average sample size per job for the Project A data set is
388 and the average sample size per job for the "McLaughlin" data set is 2,370. Appendix A
(Table  A-2)  lists  the  total  set  of  60  jobs  and  their  sample  sizes  for  the  "McLaughlin"  data  set. 
The  average  sample  size  across  these  60  jobs  is  1,002. 

2.  Predictors  and  Criteria 

For  this  study,  the  Project  A  concurrent  validation  predictors  included  the  nine  ASVAB 
tests  plus  an  expanded  set  of  20  additional  experimental  predictors.  These  new  predictors  were 
designed  to  capture  cognitive  and  noncognitive  abilities  not  covered  by  the  ASVAB:  spatial 
visualization  and  orientation,  perception  and  psychomotor  skills,  temperament/personality, 
vocational interest, and job orientation. Appendix B (Table B-1) lists the ASVAB tests and the
20 additional predictors along with their reliabilities. The "McLaughlin" predictors included
only  the  ASVAB  tests. 

The  criterion  measures  used  in  this  study  were  Core  Technical  Proficiency  (CTP)  for  the 
Project  A  concurrent  validation  data  set  and  the  Skill  Qualification  Test  (SQT)  for  the 
"McLaughlin"  data  set.  Three  jobs  that  matched  the  Project  A  jobs  in  the  "McLaughlin"  data 
set  lacked  SQT  scores,  so  end-of-course  training  scores  were  substituted  for  these  jobs. 

The  CTP  criterion  was  chosen  for  this  study  instead  of  one  or  more  of  the  other  four 
criterion  components  developed  as  part  of  the  Project  A  concurrent  validation  effort  because  it 
represents  MOS-specific  performance.  The  CTP  criterion  was  designed  to  measure  the 
proficiency  with  which  the  soldier  performs  the  tasks  that  are  "central"  to  the  MOS  (Campbell, 
Ford,  Rumsey,  Pulakos,  Borman,  Felker,  DeVera,  &  Riegelhaupt,  1990).  It  is  composed  of 
both  hands-on  and  paper-and-pencil  measures  of  MOS-specific  task  proficiency.  The  MOS- 
specific  aspect  of  the  CTP  criterion  is  important  because  it  provides  for  greater 
multidimensionality  in  the  joint  predictor-criterion  space.  It  is  desirable  to  have  a  criterion  that 
differentiates  between  jobs  to  demonstrate  classification  effects.  Evidence  from  previous 
research  supports  the  notion  that  CTP  is  better  for  differentiating  between  jobs.  Wise, 
Campbell,  and  Peterson  (1987)  reported  that  the  optimal  component  for  differentiating  between 
jobs  was  CTP,  with  the  four  other  components  showing  little  added  value  for  this  purpose.  In 
addition,  in  a  preliminary  factor  analysis  done  as  part  of  the  Whetzel  (1991)  study,  it  was  found 
that  when  all  five  criteria  were  used  in  factoring  the  predictor-criterion  covariances,  a  strong 
simple  structure  did  not  emerge.  In  other  words,  jobs  did  not  load  highly  on  one  factor  and  near 
zero  (or  at  least  much  lower)  on  all  other  factors,  hence  the  loadings  did  not  show  distinct 
differentiation  of  jobs  on  the  factors.  There  was  much  better  differentiation  found  when  the  CTP 
criterion  was  used  alone.  The  reliability  of  the  CTP  criterion  used  for  the  present  study  was  .85 
(Zeidner, 1987).

The  SQT  criterion  measure  available  in  the  "McLaughlin"  data  set  also  represents  MOS- 
specific  performance.  SQTs  have  been  administered  by  the  Army  since  1977  to  assess  soldiers’ 
qualifications  for  promotion  and  to  evaluate  the  overall  effectiveness  of  Army  training  programs. 
Each  year  a  separate  SQT  is  constructed  for  each  MOS  and  skill  level  within  that  MOS.  SQTs 
may  sample  from  12  to  36  tasks,  and  soldiers  are  allowed  to  prepare  in  advance  for  the  tasks 
to be tested. A test may consist of both hands-on and paper-and-pencil job knowledge items.
However,  for  the  period  of  data  covered  in  the  McLaughlin  et  al.  (1984)  analyses,  only  paper- 
and-pencil  job  knowledge  items  were  available. 

The  "McLaughlin"  database  also  contained  end-of-course  training  scores.  These  are  tests 
that  are  developed  at  the  schools  for  the  purpose  of  testing  whether  the  students  have  learned 
what  had  been  taught.  McLaughlin  et  al.  (1984)  used  a  combination  of  the  two  criterion 
measures  for  several  of  their  analyses.  For  the  present  study,  it  was  decided  that  the  SQT 
criterion  measure  would  be  preferable  to  a  combined  criterion  or  the  end-of-course  training 
scores  alone.  There  were  two  reasons  for  this  decision.  First,  although  both  the  SQT  measures 
and  the  end-of-course  measures  are  essentially  criterion-referenced  tests,  the  SQT  measures 
appeared from McLaughlin et al. (1984) to be better psychometrically, yielding a higher average
validity.  Second,  after  some  investigation  it  was  discovered  that  it  was  possible  to  obtain  fairly 
accurate  estimates  of  reliability  for  the  SQT  criterion  but  not  for  the  end-of-course  training 
criterion.  Reliability  estimates  were  needed  for  all  of  the  criterion  measures  used  in  this  study 
so  that  corrections  for  criterion  attenuation  could  be  made. 

These  reliability  estimates  were  obtained  by  contacting  the  U.S.  Army  Training  Support 
Center  in  Fort  Eustis,  Virginia.  It  was  discovered  that,  since  1987,  the  U.S.  Army  has  been 
collecting  extensive  reliability  information  for  the  SQTs.  For  the  present  study,  Cronbach  alpha 
reliability  estimates  were  obtained  for  the  SQTs  corresponding  to  each  of  the  60  MOS  for  the 
years  1987,  1988  and  1989.  There  were  only  10  MOS  without  reliability  estimates  for  all  three 
years.  Appendix  C  contains  the  reliability  information  across  all  three  years  for  the  60  jobs  in 
the  "McLaughlin"  data  set. 

These  Cronbach  alpha  reliability  estimates  proved  to  be  the  best  information  available 
about  SQT  reliability  so  it  was  decided  to  correct  each  MOS  in  the  "McLaughlin"  database  for 
criterion  attenuation  based  upon  the  average  reliability  for  that  MOS  across  the  three  years. 
Having three years of data should provide a consistent enough estimate of reliability to
compensate for the fact that SQTs are changed from year to year and that the data actually used
in this study are from 1981 and 1982. It is encouraging to note that the average reliabilities across
MOS  were  consistent  across  the  years.  The  average  alphas  for  1987,  1988,  and  1989  were  .83 
(n=2688),  .83  (n=2893),  and  .81  (n=3018),  respectively. 
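
The correction for criterion attenuation described above can be sketched as follows. This is a minimal illustration that assumes the classical correction, in which the observed validity is divided by the square root of the criterion reliability. The yearly alphas and the observed validity are hypothetical values for a single MOS, not figures from Appendix C, and the function name is an assumption.

```python
import math

def correct_for_attenuation(validity, criterion_reliability):
    """Classical correction of a validity coefficient for criterion unreliability."""
    if not 0.0 < criterion_reliability <= 1.0:
        raise ValueError("reliability must be greater than 0 and at most 1")
    return validity / math.sqrt(criterion_reliability)

# Hypothetical Cronbach alphas for one MOS in 1987, 1988, and 1989.
yearly_alphas = [0.84, 0.82, 0.80]
mean_alpha = sum(yearly_alphas) / len(yearly_alphas)

observed_validity = 0.40   # hypothetical uncorrected validity for that MOS
corrected_validity = correct_for_attenuation(observed_validity, mean_alpha)
```

Because the reliability is below 1, the corrected validity is always larger than the observed one, and the less reliable the criterion, the larger the correction.
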

As mentioned, there were three MOS (16S, 76Y, and 91A) in which end-of-course
training data were used instead of SQTs. Obtaining reliability data for these jobs presented an
additional problem because the training schools for these MOS indicated that no
reliability data were available. Therefore, it was decided that the SQT reliability data would have to
be  used  as  a  best  estimate  of  training  reliability.  However,  from  the  McLaughlin  et  al.  (1984) 
report  it  was  apparent  that  the  end-of-course  criterion  data  were  probably  much  less  reliable  than 
the  SQT  criterion  data.  The  average  adjusted  training  criterion  validity  in  McLaughlin  et  al. 
(1984)  was  .40,  while  the  average  adjusted  SQT  validity  was  .46.  In  order  to  remedy  this 
problem,  an  adjustment  was  made  to  the  reliability  estimates  for  MOS  16S,  76Y,  and  91A  based 
on  a  ratio  of  how  much  lower  the  reliability  of  the  training  data  would  have  to  be  to  obtain  a 
validity  estimate  .06  points  lower.  Without  this  adjustment,  the  reliability  of  the  criterion  for 
the  MOS  16S,  76Y  and  91A  would  be  greater  than  their  corresponding  validities  indicate  so  that 
the  subsequent  correction  of  these  validity  values  for  criterion  unreliability  would  be  less  than 
it  should  be.  Thus,  this  problem  would  introduce  an  inconsistency  between  these  three  MOS 
and  the  other  MOS.  This  adjustment  resulted  in  reliability  estimates  of  .58,  .66,  and  .62  for 
MOS  16S,  76Y,  and  91A,  respectively. 
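
The text does not give the formula for this adjustment, so the sketch below shows only one plausible reading and should not be taken as the computation actually performed. It assumes the classical attenuation model, in which observed validity is proportional to the square root of criterion reliability, and it assumes the two criteria would be equally valid if measured without error. Under those assumptions the training reliability is the SQT reliability multiplied by the squared ratio of the two average validities. The function name and the SQT reliability of .80 are hypothetical.

```python
def adjusted_training_reliability(sqt_reliability,
                                  training_validity=0.40,
                                  sqt_validity=0.46):
    """Scale an SQT reliability down to the level implied by the lower
    average training validity, under the classical attenuation model."""
    ratio = (training_validity / sqt_validity) ** 2
    return sqt_reliability * ratio

# Hypothetical SQT reliability for one MOS (not a value from Appendix C).
example_adjusted = adjusted_training_reliability(0.80)
```

With the average validities of .40 and .46 reported above, the squared ratio is about .76, so a hypothetical SQT reliability of .80 would be lowered to roughly .60.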

3.  Data  Corrections 

Both the Project A concurrent validation and "McLaughlin" data sets must be
corrected  for  restriction  in  range  and  criterion  attenuation.  For  the  Project  A  data  set,  all 
corrections  were  done  as  part  of  an  earlier  study  for  the  Institute  for  Defense  Analyses  (Johnson, 
Zeidner,  &  Scholarios,  1990).  These  corrections  will  be  described  below.  The  corrections  to 
the  "McLaughlin"  data  set  were  done  for  the  purposes  of  the  present  research. 

One other set of corrections, discovered to be necessary during the research done by
Whetzel (1991), will also be described in this section. For the Project A data, when the
covariances  among  the  predicted  performance  scores,  Cp,  were  factor  analyzed  it  was  discovered 
that  three  of  the  eigenvalues  of  Cp  were  negative.  This  indicated  that  the  V  matrix  (validity 
matrix)  used  for  these  calculations  was  not  positive  semi-definite.  This  problem  can  arise  from 
computing  validities  against  the  different  job  criteria  on  separate  samples  of  individuals,  as 
contrasted  with  the  ideal  research  situation  in  which  all  validities  are  computed  on  the  same  set 
of  individuals. 
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a.  Corrections  for  Restriction  in  Range 

A matrix of ASVAB (Form 8) intercorrelations for a national sample of American 18- to
23-year-olds was used as the "1980 Reference Youth Population" (Mitchell & Hanser, 1984).
The availability of the ASVAB youth population intercorrelation matrix enabled the nine
ASVAB tests to be treated as "explicit" predictor variables, i.e., variables whose
intercorrelations in the unrestricted population are known (Appendix D, Table D-1 contains
the 1980 youth population intercorrelation matrix). For the Project A concurrent validation
data, the 20 experimental predictors were treated as "implicit" predictor variables, since
the degree of their restriction was determined entirely by their correlation with the
explicit variables that are directly restricted by the selection process. The correction
procedure is based on Lawley's (1943) assumptions that the regression of the implicit
predictors on the explicit predictors is linear and that the covariances of the restricted
variables exhibit homoscedasticity. Gulliksen's formulae (1950, p. 165, numbers 37 and 42)
were applied to the youth population covariance matrix for the ASVAB tests (explicit
variables) and the covariance matrix for the 20 Project A predictors (implicit variables)
for the aggregate of the 18 MOS. The result was a corrected variance-covariance matrix that
was then converted into correlation coefficients, forming the corrected intercorrelation
matrix (R) of the 29 predictors (see Appendix D, Table D-2).

The same Gulliksen formulae were then used to correct the validities for implicit
restriction in range effects on the criterion in Project A. In this case, all predictors
were treated as explicit variables and only the criterion was implicit. The covariance
matrix for the unrestricted predictors and the implicitly restricted criterion was then
converted into a matrix of unrestricted (population) validity coefficients for each MOS
(V). For the "McLaughlin" data set, the correction for restriction in range involved only
this second procedure. In the "McLaughlin" data set, the only predictors were the nine
ASVAB tests, and the youth population intercorrelation matrix among the tests formed the R
matrix. The V matrix (of correlation coefficients) was computed directly from the
covariance matrix for the population predictors and criterion.
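For reference, the multivariate correction for explicit and implicit selection is usually written in the following form. This is a sketch of the standard Lawley-type result in notation assumed here, not a transcription of the cited Gulliksen formulae: x denotes the explicit variables, y the implicit variables, S a covariance matrix in the restricted sample, and \Sigma a covariance matrix in the unrestricted population.

```latex
% Assumed notation (not the report's): S = restricted sample covariances,
% \Sigma = unrestricted population covariances, x = explicit, y = implicit.
\begin{align}
\Sigma_{xy} &= \Sigma_{xx}\, S_{xx}^{-1}\, S_{xy} \\
\Sigma_{yy} &= S_{yy} - S_{yx} S_{xx}^{-1} S_{xy}
               + S_{yx} S_{xx}^{-1}\, \Sigma_{xx}\, S_{xx}^{-1} S_{xy} \\
r_{ij} &= \frac{\sigma_{ij}}{\sqrt{\sigma_{ii}\,\sigma_{jj}}}
\end{align}
```

The last line is the usual conversion of a corrected covariance matrix into correlation coefficients. When the restricted and unrestricted covariances of the explicit variables coincide, the first two lines leave the sample values unchanged.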

b.  Corrections  for  Criterion  Unreliability 

The validity matrices for both data sets were then further corrected for criterion
unreliability. These corrections were made using the general formula in which the validity
coefficient is divided by the square root of the corresponding criterion reliability. For
the Project A concurrent validation data set, a criterion reliability of .85 was used in
the corrections for the CTP criterion (Zeidner, 1987). As described earlier, for the
"McLaughlin" criterion, each MOS was corrected separately based on the reliabilities given
in Appendix C. Appendix D (Tables D-3 and D-4) contains the corrected validity matrices for
the Project A concurrent validation and "McLaughlin" data sets.
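The correction for criterion unreliability described above can be sketched in a few lines. The reliability of .85 is the CTP value cited in the text; the observed validity of .50 is a hypothetical input used only for illustration, and the function name is an assumption.

```python
import math


def correct_for_criterion_unreliability(validity, criterion_reliability):
    """Divide a validity coefficient by the square root of the criterion reliability."""
    if not 0.0 < criterion_reliability <= 1.0:
        raise ValueError("criterion reliability must be in (0, 1]")
    return validity / math.sqrt(criterion_reliability)


# .85 is the CTP criterion reliability cited in the text;
# the validity of .50 is hypothetical, for illustration only.
corrected = correct_for_criterion_unreliability(0.50, 0.85)
print(round(corrected, 3))
```

Because the divisor is at most one, the corrected validity is never smaller than the observed validity, and a perfectly reliable criterion leaves the coefficient unchanged.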

c. The Positive Semi-definite Condition

It is easily demonstrated that any matrix formed as the product of a real matrix
premultiplied by its transpose can have no negative eigenvalues (see Appendix E). A matrix
whose eigenvalues are all either positive real numbers or zero is referred to as positive
semi-definite. The 29 by 29 matrix of correlation coefficients among the Project A
concurrent study predictors remained positive semi-definite after corrections for
restriction in range and criterion unreliability. No adjustment was required for the 29 by
29 R matrix used in the model sampling experiment reported by Johnson, Zeidner, and
Scholarios (1990). The covariances among the predicted performance variables, Cp, were not
used in the Johnson, Zeidner, and Scholarios (1990) study, and Cp was not tested to see
whether it was positive semi-definite.
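The demonstration referred to at the start of this subsection is short. The following is a sketch of the standard argument, with A standing for any real matrix; the symbols are illustrative and are not taken from Appendix E.

```latex
% A is any real matrix and B = A'A; symbols are illustrative, not the report's.
\begin{align}
x' B x &= x' A' A x = (Ax)'(Ax) = \lVert Ax \rVert^{2} \ge 0
  \quad \text{for every real vector } x, \\
B u = \lambda u,\; u \ne 0 \;&\Rightarrow\; \lambda\, u'u = u' B u \ge 0
  \;\Rightarrow\; \lambda \ge 0 .
\end{align}
```

Every eigenvalue of such a product is therefore zero or positive, which is the positive semi-definite condition.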

However, the Whetzel (1991) study required the use of a factor solution of the matrix Cp,
defined as Cp = V R^{-1} V'. Whetzel (1991) found that Cp computed in this manner did not
meet the positive semi-definite condition. Since it was apparent from previous work by
Johnson, Zeidner, and Scholarios (1990) that R was positive semi-definite, the failure of
Cp to meet this condition had to be due to the matrix V; apparently, this particular V
matrix could not be obtained from the analysis of a single sample of either empirical or
synthetic predictor and criterion scores. Only a very small adjustment was required to
provide a V matrix that results in a positive semi-definite Cp matrix. Appendix E details
the steps taken to remove the effects of negative roots on V.
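A generic way to detect negative roots in a symmetric matrix, and to rebuild the matrix without them, is sketched below with NumPy. This is a common textbook repair and is offered only as an illustration; it is not necessarily the procedure of Appendix E, which adjusts V rather than Cp. The 2 by 2 matrix and the function names are hypothetical.

```python
import numpy as np


def negative_roots(C, tol=1e-10):
    """Return the eigenvalues of the symmetric matrix C that fall below -tol."""
    w = np.linalg.eigvalsh(C)
    return w[w < -tol]


def clip_to_psd(C):
    """Rebuild the symmetric matrix C with its negative eigenvalues set to zero."""
    w, U = np.linalg.eigh(C)
    return (U * np.clip(w, 0.0, None)) @ U.T


# Toy symmetric matrix (hypothetical, not from the report) with roots -1 and 3.
C = np.array([[1.0, 2.0], [2.0, 1.0]])
C_fixed = clip_to_psd(C)
print(negative_roots(C), negative_roots(C_fixed))
```

The rebuilt matrix keeps the eigenvectors and the non-negative roots of the original, so it is the closest positive semi-definite matrix in the least squares sense.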

The corrected R and V matrices for the 60 job samples from the "McLaughlin" data did not
have the problem of negative eigenvalues. It was found that V R^{-1} V', where V is an 18
by 9 validity matrix and R is a 9 by 9 matrix of correlation coefficients among the ASVAB
tests, has 9 positive eigenvalues and all other eigenvalues equal to zero (i.e., no
negative roots). Thus, V R^{-1} V' is positive semi-definite, and both R and V are shown to
result from data that are consistent within the predictors and across the predictor and
criterion variable sets.


B.  Research  Design 
1.  Cross-Validation  Design 

The model sampling paradigm allowed the construction of a carefully controlled
cross-validation design for this research; Figure 1 illustrates this design feature. The
cross-validation model sampling design rests on the assumption that the empirical data
(after corrections for criterion unreliability and restriction in range) provide a
reasonable estimate of the population intercorrelations and validity coefficients used in
this research. This design feature was implemented differently for Design A and Design B.

For Design A, the generation of entity samples using parameters of this designated
population resulted in (1) an analysis sample and (2) cross-validation samples (see Figure
1). The analysis sample was used for job clustering and for the computation of weights to
be applied to predictors in order to form composites used as assignment variables. The
analysis sample was generated from random numbers that were transformed to yield test
scores in independent job samples with the same sample sizes as the Project A concurrent
validation job samples and with expected covariances among predictors and criterion equal
to the parameters of the designated population. The algorithm for the generation of the
analysis sample for Design A is given in Appendix F along with the analysis sample
predictor intercorrelations (Table F-2) and the analysis sample validity coefficients
(Table F-3). There were 20 cross-validation samples generated from random numbers for
Design A, each with a sample size of 264 entities and expected covariances equal to the
covariances in the designated population (generation of the cross-samples is described in
detail in the procedure section). Weights computed from the analysis sample were applied to
the test scores of the independent cross-validation samples to compute assignment variable
scores.
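The idea of transforming random numbers into scores with a prescribed expected covariance can be sketched as follows. The sketch uses a Cholesky factor of the population matrix, which is one common way to do this; the report's own algorithm is given in its Appendix F and is not reproduced here. The 3 by 3 correlation matrix and the seed are hypothetical, while the 20 samples of 264 entities follow the Design A description.

```python
import math
import random


def cholesky(matrix):
    """Lower-triangular L with L L' = matrix (matrix must be positive definite)."""
    n = len(matrix)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                L[i][j] = (matrix[i][j] - s) / L[j][j]
    return L


def generate_sample(L, n_entities, rng):
    """Transform independent normal deviates into correlated test scores."""
    n_vars = len(L)
    sample = []
    for _ in range(n_entities):
        z = [rng.gauss(0.0, 1.0) for _ in range(n_vars)]
        sample.append([sum(L[i][k] * z[k] for k in range(i + 1))
                       for i in range(n_vars)])
    return sample


# Hypothetical 3-test population correlation matrix (not taken from the report).
R_POP = [[1.0, 0.5, 0.3],
         [0.5, 1.0, 0.4],
         [0.3, 0.4, 1.0]]

rng = random.Random(1992)  # arbitrary seed, for reproducibility only
L = cholesky(R_POP)

# 20 cross-validation samples of 264 entities each, as described for Design A.
cross_samples = [generate_sample(L, 264, rng) for _ in range(20)]
print(len(cross_samples), len(cross_samples[0]), len(cross_samples[0][0]))
```

Each generated sample has the population matrix as its expected covariance, while its observed covariance differs from sample to sample by sampling error.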

Note also from Figure 1 that a third independent source is used for evaluation of the
assignment process. The weights used to evaluate the assignment (to compute MPP) come
directly from the parameters of the designated population.


Figure 1. Typical Model Sampling Paradigm

[Diagram not reproduced. The only surviving box label is "MPP Results" (note 4).]

Notes:

1. Job validation sample sizes equal to those used in the Project A first-term concurrent
validation study.

2. Evaluation weights computed from the Project A empirical sample designated as the
population.

3. Sample sizes of assigned entities range from 200 to 300; in the aggregate, N numbers in
the thousands for each strategy.

4. Predicted performance is computed using the same evaluation variable and the same
weights for each job across all experimental conditions.

Source: Johnson, Zeidner, and Scholarios (1990)

This cross-validation design controls two distinct sources of correlated error. A
traditional source of error would occur if the validities and correlation matrices used to
obtain assignment weights were based on the same sample as those used in the simulations.
This source of error was controlled in this experiment by keeping the analysis sample used
for computing the assignment weights independent of the 20 cross-samples. A second source
of error would occur if the same weights used for assigning entities to jobs were also used
for evaluating the assignment. Using the same weights for assignment and evaluation treats
one type of error component as a gain in true performance, thus overestimating predicted
performance. For Design A, this second source of error was controlled by computing the
assignment weights from the analysis sample and computing the evaluation weights directly
from the designated population.
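The distinction between assignment weights and evaluation weights can be illustrated with a toy example. Every number below is hypothetical and was chosen only to make the contrast visible, and the brute-force search stands in for whatever optimal assignment algorithm is actually used. The function names are assumptions.

```python
from itertools import permutations


def predicted(scores, weights):
    """Predicted performance of one entity in every job (one weight row per job)."""
    return [sum(w * s for w, s in zip(job_w, scores)) for job_w in weights]


def assign(entities, assign_weights, slots):
    """Brute-force the allocation of job slots that maximizes summed assignment scores."""
    table = [predicted(e, assign_weights) for e in entities]
    candidates = sorted(set(permutations(slots)))
    return max(candidates, key=lambda p: sum(table[i][j] for i, j in enumerate(p)))


def mean_score(entities, allocation, weights):
    """Mean predicted performance of the allocated entities under the given weights."""
    total = sum(predicted(e, weights)[j] for e, j in zip(entities, allocation))
    return total / len(entities)


# All values are hypothetical.
entities = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.5]]  # two test scores each
assign_w = [[0.8, 0.0], [0.0, 0.8]]  # stand-in for analysis-sample weights
eval_w = [[0.6, 0.1], [0.1, 0.6]]    # stand-in for designated-population weights
slots = [0, 0, 1, 1]                 # two openings in each of two jobs

allocation = assign(entities, assign_w, slots)
apparent = mean_score(entities, allocation, assign_w)  # same weights assign and evaluate
mpp = mean_score(entities, allocation, eval_w)         # independent evaluation weights
print(allocation, round(apparent, 3), round(mpp, 3))
```

In this contrived case, the mean computed with the assignment weights is higher than the mean computed with the separate evaluation weights, which is the kind of inflation the design is meant to avoid.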

For Design B, it was not practical to create an analysis sample in the manner described in
Appendix F because the entity sample and the number of jobs were so large. Instead, the
designated population values were used in the job clustering and in the computation of
assignment weights. Twenty independent cross-validation samples were generated, each with a
sample size of 400. Note, however, that the weights used for evaluation (from the
designated population) are the same as the weights used for assignment. Thus, the first
source of error described above was controlled but the second was not. Consequently, the
MPP results from Design B are expected to overestimate, to some degree, what MPP would be
if all sources of error were controlled.

Although we do not have separate analysis and evaluation samples for Design B, the use of
20 independent cross-samples permits us to make the comparisons we include in Design B.
All contrasts in which the correlated error between assignment and evaluation variables
would bias the results have been included only in Design A, where this type of correlated
error has been completely eliminated. We believe that the levels of the independent
variables contrasted in Design B are not seriously affected by the degree and type of
correlated error remaining after the traditional "back validity" type of inflation has been
eliminated.

A model sampling study designed to determine the effect that analysis samples of various
sizes have on MPP after assignment has been initiated by the first two authors of this
report. This study will contrast the effect on MPP of basing the analysis on validity (job)
samples ranging from half the size of those in the concurrent study of Project A to several
times that size. Meanwhile, the results of this study can be compared with those of Whetzel
(1991), in which the analysis sample was, by implication, infinitely large.

2.  Repeated  Measures  Design 

Another  special  design  feature  used  in  this  research  is  a  repeated  measures  design.  The 
repeated  measures  design  chosen  for  this  research  is  one  in  which  each  of  the  20  cross-samples 
of  "individuals"  was  exposed  to  all  treatment  conditions.  This  design  helps  to  control  the  error 
variance  between  entities  and  helps  to  limit  the  number  of  entities  that  must  be  generated  for 
each  condition. 

A repeated measures design has several commonly cited disadvantages. However, most of these
are irrelevant in the present study because it is a model sampling experiment. For example,
with repeated measures it is commonly necessary to devise elaborate randomized block
designs to ensure that the order of treatments does not cause confounding due to carry-over
effects across conditions (e.g., from practice, fatigue, transfer of training). With an
artificially generated sample of test scores, this disadvantage is not an issue.

Another disadvantage often cited is that, because the repeated measures design allows a
smaller sample size, it also decreases the accuracy of estimation: the population of
subjects is not as well represented as it would be if a larger sample were used. However,
because this research is a model sampling experiment, it was possible to replicate the
entire experiment 20 times, thereby increasing the sample size. This replication of all
conditions 20 times forms the 20 cross-samples referred to throughout this discussion. It
is important that 20 cross-samples are used in this model sampling experiment instead of
one large cross-sample because optimal assignment with an extremely large sample size is
prohibitively expensive. Optimal assignment with smaller samples, 20 different times, is
much more practical and feasible.

3.  Experimental  Design 

This  research  was  designed  to  enable  an  analysis  of  the  effects  of  restructuring  job 
families  on  classification  efficiency  by  increasing  the  number  of  job  families  and  by  changing 
the composition of the jobs within these families. Design A utilized the Project A
concurrent validation data and the "McLaughlin" data for 18 jobs. Design B utilized only
the "McLaughlin" data for an expanded set of 60 jobs.

Design A can be divided into three components. The first component forms the basic research
design (Design A-1), with two other components forming baseline conditions based on the
operational job families currently used by the U.S. Army (Designs A-2 and A-3).

a. Design A-1: Basic Research Design

The  basic  research  design  consists  of  three  independent  variables  that  combine  to  form 
18  experimental  conditions,  each  one  represented  by  a  cell  containing  20  cross-samples.  These 
three  independent  variables  include  a  clustering  methods  factor,  a  number  of  job  families  factor, 
and  a  data  source  factor. 

The  clustering  methods  factor  consists  of  two  methods,  a  method  for  clustering  jobs  to 
maximize  classification  efficiency  (CE  method)  and  a  method  for  clustering  jobs  to  maximize 
selection  efficiency  (SE  method).  The  CE  method  involves  minimizing  the  reduction  in  Horst’s 
(1954)  differential  index  during  each  iteration  in  which  jobs  are  formed  into  job  families.  The 
SE  method  maximizes  selection  efficiency  by  ensuring  that  each  job  family  has  the  maximum 
obtainable  average  multiple  correlation  coefficient  (R). 

The number of job families factor consists of three levels (6, 9, and 12 families). These
levels were chosen to represent three fewer than the current number of operational job
families, the current number, and three more.

The  data  source  variable  represents  three  distinct  sources  of  predictor  and  criterion  data: 
(a)  the  experimental  Project  A  concurrent  validation  test  battery  (Exp.  Batt.-A)  with  the  CTP 
criterion,  (b)  the  standard  ASVAB  test  battery  from  the  Project  A  concurrent  validation  data  set 
(ASVAB-A)  with  the  CTP  criterion,  and  (c)  the  standard  ASVAB  test  battery  from  the 
"McLaughlin"  data  set  (ASVAB-McL)  with  the  SQT  and  training  scores  criteria.  The 
experimental  Project  A  test  battery  consists  of  the  ASVAB  tests  plus  20  experimental  predictors. 
This  experimental  test  battery  represents  an  expansion  of  multidimensionality  in  the  predictor 
space.  The  experimental  battery  is  compared  to  the  standard  ASVAB  test  battery  from  Project 
A to determine whether expanding the predictor space results in greater MPP. The purpose of
including the "McLaughlin" data set is to compare the SQT/training score criteria,
frequently criticized as inappropriate for personnel selection research, with the specially
developed Project A CTP criterion. Since the ASVAB-A and the ASVAB-McL conditions are both
based on the same nine ASVAB tests, it is possible to attribute differences in MPP between
these two conditions to the different criterion measures.

Table  1  shows  how  the  three  independent  variables  just  described  combine  to  form  the 
18  experimental  conditions. 

Table 1

Design A-1: Basic Research Design

Clustering   Job                       Data Source
Methods      Families   Exp. Batt.-A   ASVAB-A   ASVAB-McL

SE            6
              9
             12

CE            6
              9
             12

Note. Each cell represents one experimental condition containing 20 cross-samples.
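The crossing of the three factors can be enumerated directly. The sketch below uses the factor levels named in the text; the variable names are illustrative.

```python
from itertools import product

CLUSTERING_METHODS = ("SE", "CE")
JOB_FAMILIES = (6, 9, 12)
DATA_SOURCES = ("Exp. Batt.-A", "ASVAB-A", "ASVAB-McL")
CROSS_SAMPLES_PER_CELL = 20

# One cell per combination of clustering method, number of families, and data source.
cells = list(product(CLUSTERING_METHODS, JOB_FAMILIES, DATA_SOURCES))

for method, families, source in cells:
    print(f"{method:2}  {families:2d} families  {source}")

# Each cell is evaluated on 20 cross-samples.
total_runs = len(cells) * CROSS_SAMPLES_PER_CELL
print(len(cells), "cells,", total_runs, "assignment runs")
```

The count of 2 x 3 x 3 = 18 cells matches the 18 experimental conditions of the basic research design.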

b. Design A-2: First Baseline Condition

The first baseline condition forms an additional design feature that is external to the
above factorial design. The current U.S. Army operational nine-job-families system
constitutes a baseline condition to be contrasted against a combined SE and CE,
nine-job-families condition. This comparison is made for two of the three levels of the
data source factor (Exp. Batt.-A and ASVAB-A). Thus, this design has a 2 x 2 factorial
structure, which requires that two new cells (20 cross-samples each) be created for the new
condition. The other two cells contain data obtained from the basic research design
discussed previously. Table 2 illustrates Design A-2.


Table 2

Design A-2: First Baseline Condition

                              Data Source
Clustering Methods    Exp. Batt.-A   ASVAB-A

Empirical (CE/SE)
Operational

The U.S. Army's job family system constitutes a baseline condition because it allows a
comparison of the Army's operational job family structure with the job family structures
developed empirically in this research. Note, however, that in this design all of the
assignment variables are FLS composites; thus, although the operational job families are
duplicated, the operational aptitude area composites are not. Table 3 shows the current
U.S. Army operational job families and indicates the job family to which each of the 18
jobs included in Design A belongs.

c. Design A-3: Second Baseline Condition

The second baseline condition consists of a single condition (one cell of 20
cross-samples). In this cell, the existing U.S. Army aptitude area composites are used as
assignment variables instead of FLS composites. The nine operational job families are the
targets of the assignment. This condition, then, represents the current composite system
that the Army uses for selection and classification. Thus, it is possible to determine the
magnitude of the effect that use of the FLS composites has on MPP in comparison to the
current composite system. The aptitude area composites used in this research are given in
Appendix B (Table B-2). It is predicted that use of the aptitude area composites for
assignment to the existing nine job families will result in an MPP lower than that of any
of the Project A cells in the previous two designs.


Table 3

Design A MOS Grouped into Current Operational Job Families

Operational Job Families           18 Design A MOS

Clerical/Administrative (CL)       71L Admin Specialist
                                   76W Petroleum Supply Sp
                                   76Y Unit Supply Sp

Combat (CO)                        11B Infantryman
                                   12B Combat Engineer
                                   19E M48-M60 Armor Crmn

Electronics Repair (EL)            27E TOW/Dragon Repairer

Field Artillery (FA)               13B Cannon Crewman

General Maintenance (GM)           55B Ammunition Specialist

Mechanical Maintenance (MM)        63B Light Wheel Vehicle/Power Gen Mechanic
                                   67N Utility Helicopter Rep

Operators/Food (OF)                16S MANPADS Crewman
                                   64C Motor Transport Op
                                   94B Food Service Sp

Surveillance/Communication (SC)    31C Single Channel Radio Operator

Skilled Technical (ST)             54E Nuclear, Biological, and Chemical Sp
                                   91A Medical Specialist
                                   95B Military Police

d.  Design  B 

Design B contains two independent variables: a clustering methods factor and a number of
job families factor. In this case, the number of job families is expanded to 9, 16, and
23. This is possible because of the use of the "McLaughlin" data with 60 jobs. Table 4
illustrates the conditions of Design B.

Table 4

Design B

Job           Clustering Methods
Families    Empirical   Operational

 9
16
23

The  best  empirical  clustering  method  (CE  or  SE)  is  used  to  cluster  the  60  jobs  into  9, 
16,  and  23  job  families.  This  clustering  can  then  be  compared  to  the  current  Army  operational 
job  families.  The  9  job  family  operational  condition  represents  the  same  9  Army  job  families 
that  were  used  in  Design  A.  The  23  job  family  operational  condition  is  based  upon  the  Army’s 
Career  Management  Fields  (CMF).  The  Army  currently  has  33  CMF  categories  in  which  jobs 
are  grouped,  but  for  the  present  research  it  was  only  possible  to  include  jobs  from  23  of  the 
CMF  with  the  job  sample  available  in  the  "McLaughlin"  database.  The  16  job  family 
operational  condition  was  developed  for  the  purposes  of  this  research  as  a  combination  of  the 
Army’s  nine  job  family  aptitude  areas  and  the  CMF.  In  determining  these  16  job  families, 
certain  CMF  categories  were  combined  taking  the  9  aptitude  areas  into  account  along  with  the 
number  of  jobs  in  the  CMF  categories  and  the  similarity  of  the  jobs  in  the  combining  categories. 
Thus,  in  several  cases  where  a  single  job  represented  a  CMF  category,  this  job  was  grouped 
with  another  CMF  category  that  contained  jobs  in  the  same  aptitude  area.  In  addition,  some 
CMF  categories  were  combined  based  on  the  similarities  of  the  jobs  and  similarities  of  the 
aptitude  areas.  For  example,  CMF  63  Mechanical  Maintenance  and  CMF  67  Aircraft 
Maintenance were combined because the jobs were all maintenance jobs that cut across the
same two aptitude areas (general mechanical and mechanical maintenance). Tables 5, 6, and
7 present the 60 jobs used in Design B grouped into the 9, 16, and 23 Army operational job
families. Each table contains columns of numbers following the jobs to indicate where the
jobs are grouped in the other job family structures.
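The cross-referencing columns in these tables amount to a lookup from each MOS to its family number in each of the three structures. A minimal sketch of that lookup follows, using four rows copied from Table 5; the remaining MOS are omitted for brevity, and the names `FAMILY_INDEX`, `STRUCTURES`, and `group_by_structure` are illustrative.

```python
from collections import defaultdict

# (family of 9, family of 16, family of 23) for four MOS copied from Table 5.
FAMILY_INDEX = {
    "71L": (1, 9, 13),
    "11C": (2, 1, 1),
    "12F": (2, 1, 2),
    "13F": (4, 2, 3),
}

# Position of each job family structure within the tuples above.
STRUCTURES = {9: 0, 16: 1, 23: 2}


def group_by_structure(index, n_families):
    """Group MOS codes by their family number in the chosen structure."""
    groups = defaultdict(list)
    for mos, families in index.items():
        groups[families[STRUCTURES[n_families]]].append(mos)
    return dict(groups)


print(group_by_structure(FAMILY_INDEX, 16))
print(group_by_structure(FAMILY_INDEX, 23))
```

Jobs that share a family in one structure, such as 11C and 12F in the 16-family structure, can fall into different families in another.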


C.  Procedure 

1.  Job  Clustering  Procedures 

Jobs were clustered into job families using two different clustering algorithms. One
clustering algorithm attempted to maximize classification efficiency by minimizing the
reduction in Horst's (1954) differential index during each iteration in which jobs are
formed into job families. The second algorithm attempted to maximize selection efficiency
by ensuring that each job family formed has the maximum obtainable average multiple
correlation coefficient (R).

a. Clustering to Maximize Classification Efficiency (CE)

The algorithm that clusters jobs to maximize classification efficiency is called the
classification-efficient (CE) clustering method. This method follows a series of iterative
steps beginning with the input of an F matrix (see Appendix G for the
classification-efficient program). The F matrix represents a principal components solution
from the factorization of the joint predictor-criterion space, Cp, where Cp is calculated
by:

Cp = V R^{-1} V'

For Design A, the resulting F matrix is either an 18 by 18 matrix (for Experimental
Battery-Project A) or an 18 by 9 matrix (ASVAB-Project A and "McLaughlin"); that is, 18
jobs (rows) by either 18 or 9 factors (columns). For Design B, the resulting F matrix is a
60 by 9 matrix (60 jobs with 9 factors).
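The first computation performed on the F matrix, which the text goes on to describe, centers the columns, forms the deviations, and takes the diagonal of the deviation matrix times its transpose. A minimal sketch of that step follows, using a hypothetical 3-job by 2-factor F matrix and unit weights; it is not the Appendix G program, and the function names are assumptions.

```python
def horst_differential_terms(F):
    """Row-wise squared deviations of F from its column means (the diagonal of GG')."""
    n_rows, n_cols = len(F), len(F[0])
    col_means = [sum(row[c] for row in F) / n_rows for c in range(n_cols)]
    G = [[row[c] - col_means[c] for c in range(n_cols)] for row in F]
    return [sum(g * g for g in row) for row in G]  # the vector Dg


def weighted_sum(Dg, weights=None):
    """Weighted sum of the Dg elements; weights default to one for every job."""
    if weights is None:
        weights = [1.0] * len(Dg)
    return sum(w * d for w, d in zip(weights, Dg))


# Hypothetical 3-job by 2-factor F matrix, for illustration only.
F = [[0.6, 0.2],
     [0.4, 0.0],
     [0.5, -0.2]]

Dg = horst_differential_terms(F)
total = weighted_sum(Dg)
print([round(d, 3) for d in Dg], round(total, 3))
```

Only the diagonal of GG' is needed, so the sketch computes each row's sum of squared deviations directly instead of forming the full product.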

The column means of this F matrix are then calculated, and these column means are
subtracted from each corresponding column element of the F matrix to form a matrix of
deviations, G. This G matrix is then post-multiplied by its transpose, GG', and the
diagonal elements of the resulting matrix are extracted to form a vector Dg. The weighted
sum of all of the elements of this Dg vector is equal to an average Hd (Horst's
differential index) across job families. For the first iteration the weights to be applied
to the Dg vector will all be one, but


Table 5

Design B MOS Grouped into 9 Operational Job Families

                                             9         16       23
MOS                                  n      AA  Comb. CMF      CMF

1. CLERICAL/ADMINISTRATIVE
71L Administrative Sp*            2824       1          9       13
71M Chapel Activities Sp           182       1          9       13
73C Finance Specialist             688       1          9       13
75B Personnel Admin Sp            1061       1          9       13
76C Eq Rec & Parts Sp              331       1         11       15
76V Mat Stor & Hdlg Sp             216       1         11       15
76Y Unit Supply Sp*               1149       1         11       15
76W Petroleum Supply Sp*           664       1         11       16
71N Traffic Mgmt Coordinator       163       1         12       17

2. COMBAT
11B Infantryman*                  6355       2          1        1
11C Indirect Fire Infmn           1494       2          1        1
11H HV Anti-Armor Wpn Infn         979       2          1        1
12B Combat Engineer*              3109       2          1        2
12F Engineer Tracked Crmn          151       2          1        2
19D Cavalry Scout                 1249       2          1        5
19E M48-M60 Armor Crmn*           3297       2          1        5

3. ELECTRONICS REPAIR
27E TOW/Dragon Rep*                363       3          3        6
27F Vulcan Repairer                130       3          3        6
31M Multichannel Comm Eq Op       2482       3          4        7
31N Tactical Ckt Con               189       3          4        7
31V Tac Comm Sysop/Mech            515       3          4        7
36C Wire Sys Inst/Op               499       3          4        7

4. FIELD ARTILLERY
13B Cannon Crmn*                  6575       4          2        3
13F Fire Support Sp                693       4          2        3

5. GENERAL MECHANICAL
62E HV Const Equip Rep             202       5          5        8
62F Lifting/Loading Eq Op          129       5          5        8
55B Ammunition Sp*                 288       5          7       10
52D Power Generation Equip Rep     178       5          8       11
68J Aircraft FC Repairer           148       5          8       12
43E Parachute Rigger               100       5         11       15
57H Cargo Specialist               272       5         12       17

6. MECHANICAL MAINTENANCE
12C Bridge Crewman                 450       6          1        2
62B Construction Equip Rep         233       6          8       11
63B Lt Wh Veh/Pwr Gen Mech*       1495       6          8       11
63H Track Veh Repairer             335       6          8       11
63N M60A1/A3 Tank Sys Mech         286       6          8       11
63W Wheel Veh Mechanic             180       6          8       11
67N Utility Hel Repairer*          511       6          8       12
67V OBN/Scout Hel Rep              294       6          8       12
68G Aircraft Structural Rep        125       6          8       12

7. OPERATORS/FOOD
15D Lance Crmb/MLRS Sgt            281       7          2        3
16S MANPADS Crewmember*            596       7          2        4
64C Motor Transport Op*           3681       7         12       17
94B Food Service Sp*              3943       7         15       20

8. SURVEILLANCE/COMMUNICATION
05C Radio TT Operator*            2393       8          4        7
72E Combat Telecom Center Op       569       8          4        7

9. SKILLED TECHNICAL
13E Cannon Fire Direction Sp       627       9          2        3
82C Field Artillery Surveyor       434       9          2        3
54E NBC Specialist*                113       9          6        9
74D Computer/Tape Writer           132       9         10       14
74F Programmer/Analyst              95       9         10       14
91B Medical Specialist*            783       9         13       18
91E Dental Specialist              203       9         13       18
91P X-Ray Specialist               159       9         13       18
92B Medical Lab Sp                 310       9         13       18
93H Air Traffic Con Tower Op       114       9         14       19
95B Military Police*              4516       9         16       21
96B Intelligence Analyst           218       9         16       22
05H Elec War/SIGINT Inter-IMC      171       9         16       23
98C Elec War/SIGINT Analyst        186       9         16       23

Note. AA = aptitude area (9 job families); Comb. CMF = combined CMF (16 job families);
CMF = Career Management Field (23 job families).
* = MOS for Design A

Table 6


Design B MOS Grouped into 16 Operational Job Families

9  16  23 


#  MOS 

n 

APTITUDE 

AREA(AA) 

COMBINED 

CMF 

CMF 

Job  Family  1 

11B  Infantryman* 

6355 

2 

1 

1 

11C  Indirect  Fire  Infmn 

1494 

2 

1 

1 

1 1 H  HV  Anti-Armor  Upn  Infn 

979 

2 

1 

1 

12B  Combat  Engineer* 

3109 

2 

1 

2 

12F  Engineer  Tracked  Crmn 

151 

2 

1 

2 

190  Cavalry  Scout 

1249 

2 

1 

5 

19E  M48-M60  Armor  Crmn* 

3297 

2 

1 

5 

12C  Bridge  Crewman 

450 

6 

1 

2 

Job  Family  2 

138  Cannon  Crmn* 

6575 

4 

2 

3 

13F  Fire  Support  Sp 

693 

4 

2 

3 

150  Lance  Crmb/MLRS  Sgt 

281 

7 

2 

3 

16S  MANPADS  Crewmember* 

596 

7 

2 

4 

13E  Cannon  Fire  Direction  Sp 

627 

9 

2 

3 

82C  Field  Artillery  Surveyor 

434 

9 

2 

3 

Job  Family  3 

27E  TOW/Dragon  Rep* 

363 

3 

3 

6 

27F  Vulcan  Repairer 

130 

3 

3 

6 

Job  Fami ly  4 

31M  Multichannel  Comm  Eq  Op 

2482 

3 

4 

7 

31N  Tactical  Ckt  Con 

189 

3 

4 

7 

31V  Tac  Comm  Sysop/Mech 

515 

3 

4 

7 

36C  Wire  Sys  Inst/Op 

499 

3 

4 

7 

05C  Radio  TT  Operator* 

2393 

8 

4 

7 

72E  Combat  Telecom  Center  Op 

569 

8 

4 

7 

Job  Family  5 

62E  HV  Const  Equip  Rep 

202 

5 

5 

8 

62F  Lifting/Loading  Eq  Op 

129 

5 

5 

8 

Job  Family  6 

54E  NBC  Specialist* 

113 

9 

6 

9 

Job  Family  7 

55B  Ammunition  Sp* 

288 

5 

7 

10 

Job  Family  8 

520  Power  Generation  Equip  Rep 

178 

5 

8 

11 

68J  Aircraft  FC  Repairer 

148 

5 

8 

12 

62B  Construction  Equip  Rep 

233 

6 

8 

11 

63B  Lt  Wh  Veh/Pwr  Gen  Mech* 

1495 

6 

8 

11 

63H  Track  Veh  Repairer 

335 

6 

8 

11 

63N  M60A1/A3  Tank  Sys  Mech 

286 

6 

8 

11 

63W  Wheel  Veh  Mechanic 

180 

6 

8 

11 

67N  Utility  Hel  Repairer* 

511 

6 

8 

12 

67V  OBN/Scout  Hel  Rep 

294 

6 

8 

12 

68G  Aircraft  Structural  Rep 

125 

6 

8 

12 
Table 6 (Continued)

                                               9         16       23
                                        APTITUDE   COMBINED
  MOS                                 n AREA(AA)        CMF      CMF

Job Family 9
  71L Administrative Sp*           2824        1          9       13
  71M Chapel Activities Sp          182        1          9       13
  73C Finance Specialist            688        1          9       13
  75B Personnel Admin Sp           1061        1          9       13

Job Family 10
  74D Computer/Tape Writer          132        9         10       14
  74F Programmer/Analyst             95        9         10       14

Job Family 11
  76C Eq Rec & Parts Sp             331        1         11       15
  76V Mat Stor & Hdlg Sp            216        1         11       15
  76Y Unit Supply Sp*              1149        1         11       15
  76W Petroleum Supply Sp*          664        1         11       16
  43E Parachute Rigger              100        5         11       15

Job Family 12
  71N Traffic Mgmt Coordinator      163        1         12       17
  57H Cargo Specialist              272        5         12       17
  64C Motor Transport Op*          3681        7         12       17

Job Family 13
  91B Medical Specialist*           783        9         13       18
  91E Dental Specialist             203        9         13       18
  91P X-Ray Specialist              159        9         13       18
  92B Medical Lab Sp                310        9         13       18

Job Family 14
  93H Air Traffic Con Tower Op      114        9         14       19

Job Family 15
  94B Food Service Sp*             3943        7         15       20

Job Family 16
  95B Military Police*             4516        9         16       21
  96B Intelligence Analyst          218        9         16       22
  05H Elec War/SIGINT INTER-MC      171        9         16       23
  98C Elec War/SIGINT Analyst       186        9         16       23

* = MOS for Design A
Table 7

Design B MOS Grouped into 23 Operational Job Families

                                               9         16       23
                                        APTITUDE   COMBINED
  MOS                                 n AREA(AA)        CMF      CMF

1. INFANTRY (11)
  11B Infantryman*                 6355        2          1        1
  11C Indirect Fire Infmn          1494        2          1        1
  11H HV Anti-Armor Wpn Infn        979        2          1        1

2. COMBAT ENGINEERING (12)
  12B Combat Engineer*             3109        2          1        2
  12F Engineer Tracked Crmn         151        2          1        2
  12C Bridge Crewman                450        6          1        2

3. FIELD ARTILLERY (13)
  13B Cannon Crmn*                 6575        4          2        3
  13F Fire Support Sp               693        4          2        3
  15D Lance Crmb/MLRS Sgt           281        7          2        3
  13E Cannon Fire Direction Sp      627        9          2        3
  82C Field Artillery Surveyor      434        9          2        3

4. AIR DEFENSE ARTILLERY (16)
  16S MANPADS Crewmember*           596        7          2        4

5. ARMOR (19)
  19D Cavalry Scout                1249        2          1        5
  19E M48-M60 Armor Crmn*          3297        2          1        5

6. LAND COMBAT/AD SYS INTRM MAINTENANCE (27)
  27E TOW/Dragon Rep*               363        3          3        6
  27F Vulcan Repairer               130        3          3        6

7. SIGNAL OPERATIONS (31)
  31M Multichannel Comm Eq Op      2482        3          4        7
  31N Tactical Ckt Con              189        3          4        7
  31V Tac Comm Sysop/Mech           515        3          4        7
  36C Wire Sys Inst/Op              499        3          4        7
  05C Radio TT Operator*           2393        8          4        7
  72E Combat Telecom Center Op      569        8          4        7

8. GENERAL ENGINEERING (51)
  62E HV Const Equip Rep            202        5          5        8
  62F Lifting/Loading Eq Op         129        5          5        8

9. CHEMICAL (54)
  54E NBC Specialist*               113        9          6        9

10. AMMUNITION (55)
  55B Ammunition Sp*                288        5          7       10

11. MECHANICAL MAINTENANCE (63)
  52D Power Generation Equip Rep    178        5          8       11
  62B Construction Equip Rep        233        6          8       11
  63B Lt Wh Veh/Pwr Gen Mech*      1495        6          8       11
  63H Track Veh Repairer            335        6          8       11
  63N M60A1/A3 Tank Sys Mech        286        6          8       11
  63W Wheel Veh Mechanic            180        6          8       11

(Continued)
Table 7 (Continued)

                                               9         16       23
                                        APTITUDE   COMBINED
  MOS                                 n AREA(AA)        CMF      CMF

12. AIRCRAFT MAINTENANCE (67)
  68J Aircraft FC Repairer          148        5          8       12
  67N Utility Hel Repairer*         511        6          8       12
  67V OBN/Scout Hel Rep             294        6          8       12
  68G Aircraft Structural Rep       125        6          8       12

13. ADMINISTRATION (71)
  71L Administrative Sp*           2824        1          9       13
  71M Chapel Activities Sp          182        1          9       13
  73C Finance Specialist            688        1          9       13
  75B Personnel Admin Sp           1061        1          9       13

14. AUTOMATIC DATA PROCESSING (74)
  74D Computer/Tape Writer          132        9         10       14
  74F Programmer/Analyst             95        9         10       14

15. SUPPLY AND SERVICE (76)
  76C Eq Rec & Parts Sp             331        1         11       15
  76V Mat Stor & Hdlg Sp            216        1         11       15
  76Y Unit Supply Sp*              1149        1         11       15
  43E Parachute Rigger              100        5         11       15

16. PETROLEUM AND WATER (77)
  76W Petroleum Supply Sp*          664        1         11       16

17. TRANSPORTATION (88)
  71N Traffic Mgmt Coordinator      163        1         12       17
  57H Cargo Specialist              272        5         12       17
  64C Motor Transport Op*          3681        7         12       17

18. MEDICAL (91)
  91B Medical Specialist*           783        9         13       18
  91E Dental Specialist             203        9         13       18
  91P X-Ray Specialist              159        9         13       18
  92B Medical Lab Sp                310        9         13       18

19. AVIATION OPERATION
  93H Air Traffic Con Tower Op      114        9         14       19

20. FOOD SERVICE (94)
  94B Food Service Sp*             3943        7         15       20

21. MILITARY POLICE (95)
  95B Military Police*             4516        9         16       21

22. MILITARY INTELLIGENCE (96)
  96B Intelligence Analyst          218        9         16       22

23. ELECTRONIC WARFARE/CRYPTOLOGIC OP (98)
  05H Elec War/SIGINT INTER-MC      171        9         16       23
  98C Elec War/SIGINT Analyst       186        9         16       23

* = MOS for Design A
as jobs are formed into job families, the weights become equal to the number of jobs in each
developing job family. This process ensures that the average Hd will always correspond to all of
the jobs in the evolving job families.

The next step in this algorithm utilizes the Df vector to form an A matrix. This A matrix
consists of the weighted sum of all possible pairs of the elements of the Df vector. Each
element of the A matrix is calculated using the following formula:

    a_{ij} = n_i (d_i) + n_j (d_j)

where,

d = an element of Df

n = number of jobs in the ith or jth job family

Thus, the A matrix is an m by m square matrix, where m is the number of job families.
Conceptually, the A matrix represents the contribution of the jobs in each pair of evolving job
families to Hd.

The next step in this algorithm utilizes the F matrix described previously to form a B
matrix. For each column of F, the ith and jth factor loadings are weighted by n_i and n_j and
summed to form the weighted sums. These sums are divided by (n_i + n_j). The column mean is then
subtracted and each difference is squared. These calculations are repeated for each column of
F, and all of these elements are summed and multiplied by (n_i + n_j) to form the ijth
element of the B matrix. This process is repeated for all possible pairs of rows in
the F matrix. The notation for these calculations can be represented as follows:

    b_{ij} = [ ((n_i f_{i1} + n_j f_{j1}) / (n_i + n_j) - c_1)^2
             + ((n_i f_{i2} + n_j f_{j2}) / (n_i + n_j) - c_2)^2
             + \cdots
             + ((n_i f_{ik} + n_j f_{jk}) / (n_i + n_j) - c_k)^2 ] (n_i + n_j)

where,

f = an element of the F matrix
n = number of jobs in the ith or jth job family
c = column means from the F matrix
m = number of job families
k = index of the last column of the F matrix

Conceptually, the B matrix acts as a trial deviation matrix that indicates how much of a
reduction in Hd there is when any two job families for that iteration are combined.

The final step in the algorithm is to subtract all of the elements in the B matrix from the
corresponding elements in the A matrix (i.e., A - B) to form a D matrix. The smallest element in this
D  matrix  represents  the  two  jobs  (or  job  families)  that  when  combined  will  minimize  the 
reduction  in  Hd.  Thus,  the  two  job  families  corresponding  to  the  smallest  value  in  the  D  matrix 
are  chosen  and  the  two  rows  representing  these  job  families  in  the  F  matrix  are  averaged 
together  to  form  a  new  F  matrix.  This  new,  smaller  F  matrix  is  then  used  to  begin  the  next 
iteration.  It  is  important  to  note  that  the  same  column  means  of  the  F  matrix  from  the  very  first 
iteration  (i.e.,  all  jobs)  are  used  for  every  iteration.  Thus,  each  new  iteration  starts,  not  with 
the  recalculation  of  the  column  means,  but  with  the  recalculation  of  the  G  matrix. 
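
The pair-selection step of one iteration can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the program used in this research: the function name `ce_merge_step` and the toy inputs are hypothetical, and the Df values, job counts, and first-iteration column means are simply supplied by the caller rather than recomputed.

```python
def ce_merge_step(F, d, n, c):
    """Return ((i, j), value) for the smallest element of D = A - B.

    F : rows of factor loadings, one row per current job family
    d : elements of the Df vector, one per job family
    n : number of jobs in each job family
    c : column means of F from the very first iteration (held fixed)
    """
    m = len(F)
    best_pair, best_value = None, float("inf")
    for i in range(m):
        for j in range(i + 1, m):
            size = n[i] + n[j]
            # A matrix element: a_ij = n_i*d_i + n_j*d_j
            a_ij = n[i] * d[i] + n[j] * d[j]
            # B matrix element: weighted mean loading per column, minus the
            # column mean, squared, summed over columns, times (n_i + n_j)
            b_ij = size * sum(
                ((n[i] * fi + n[j] * fj) / size - ck) ** 2
                for fi, fj, ck in zip(F[i], F[j], c)
            )
            if a_ij - b_ij < best_value:
                best_pair, best_value = (i, j), a_ij - b_ij
    return best_pair, best_value


# Toy data (hypothetical): three single-job families measured on two factors.
F = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
n = [1, 1, 1]
c = [2.0 / 3.0, 1.0 / 3.0]               # column means of F
d = [2.0 / 9.0, 2.0 / 9.0, 8.0 / 9.0]    # assumed Df values for the toy data
pair, value = ce_merge_step(F, d, n, c)
print(pair, abs(value) < 1e-9)
```

In the toy data the first two jobs have identical loadings, so combining them costs nothing and they are the pair selected.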

Depending upon the condition, the iterations continue until either 6, 9, or 12
job families are formed in Design A and until 9, 16, or 23 job families are formed in Design
B. Hd is calculated during each iteration from the Df vector so that it is possible to track the
amount by which it is reduced with each iteration.

b. Clustering to Maximize Selection Efficiency (SE)

The  algorithm  that  clusters  jobs  to  maximize  predictive  validity  is  called  the  selection- 
efficient  (SE)  clustering  method.  This  algorithm  represents  a  heuristic  that  attempts  to  provide 
a  practical  way  of  obtaining  the  highest  possible  predictive  validity  in  a  set  of  job  clusters.  The 
absolute optimal method for forming selection-efficient job clusters would be to evaluate every
possible combination of jobs in terms of predictive validity. This would require calculating R^2
values for millions of different combinations, which is beyond the practical resources of this
research. Instead, an algorithm was developed that utilizes two stages in an attempt to obtain
the  highest  predictive  validity  possible  (see  Appendix  H  for  the  selection-efficient  clustering 
program).  The  descriptions  contained  in  this  section  are  limited  to  describing  the  selection- 
efficient  procedures  for  clustering  the  18  jobs  in  Design  A.  This  clustering  procedure  was  not 
used  in  Design  B. 

In  the  first  stage,  the  validity  matrix  (V)  from  the  analysis  sample  is  used  to  determine 
an  initial  combination  of  jobs  into  job  families.  Each  row  of  V  represents  the  validities 
pertaining  to  a  job.  Depending  upon  the  condition  (6,  9,  or  12  job  families),  differing  numbers 
of  rows  are  averaged  to  form  single  validity  vectors.  For  example,  for  the  6  job  family 
condition,  all  possible  combinations  (816  combinations)  of  three  jobs  are  averaged.  The  result 
is a new 816 x 29 validity matrix (V1) in which each row represents a different three-job
combination. For each of these rows, V R^{-1} V' is calculated to obtain R^2 (R is the matrix of
predictor intercorrelations from the analysis sample). An iterative procedure is then performed
in which the three-job group with the largest R is selected first, then the next largest R that does
not involve the three jobs in the first selected family, etc., until six non-overlapping families have
been located. Thus, for the 6 job families condition, all 18 jobs are covered by the six
groupings of three jobs each. For the 9 job families condition, all possible combinations (153
combinations) of two jobs will be averaged, forming V2, and then the iteration procedure
described above commences. For the 12 job families condition, as many two-job combinations
as possible will be formed, but some of the final job groupings will contain only one job (V3).
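
The first stage for the 6 job family condition can be sketched in Python as below. This is an illustration only, not the Appendix H program: the function names and the validity values are hypothetical, and for brevity the scoring function assumes uncorrelated predictors (R equal to the identity matrix), in which case V R^{-1} V' for a row reduces to the sum of its squared averaged validities.

```python
from itertools import combinations
from math import comb


def greedy_disjoint_groups(n_jobs, group_size, score):
    """Pick the highest-scoring groups that share no job, best first."""
    ranked = sorted(combinations(range(n_jobs), group_size), key=score, reverse=True)
    chosen, used = [], set()
    for group in ranked:
        if used.isdisjoint(group):
            chosen.append(group)
            used.update(group)
    return chosen


def r_squared_identity(V, group):
    """R^2 of the averaged validity vector, ASSUMING R = identity."""
    k = len(group)
    avg = [sum(V[j][t] for j in group) / k for t in range(len(V[0]))]
    return sum(x * x for x in avg)


# Hypothetical validities: 18 jobs by 2 predictors.
V = [[0.10 + 0.02 * j, 0.50 - 0.02 * j] for j in range(18)]
families = greedy_disjoint_groups(18, 3, lambda g: r_squared_identity(V, g))
print(comb(18, 3), comb(18, 2), len(families))  # prints: 816 153 6
```

The counts 816 and 153 are the numbers of three-job and two-job combinations of 18 jobs cited in the text, and the greedy pass always ends with six triplets that cover all 18 jobs.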

In  the  second  stage,  the  initial  job  groupings  formed  in  the  first  stage  are  shredded  to 
determine  if  other  job  combinations  result  in  higher  average  multiple  R’s.  For  example,  for  the 
6  job  family  condition,  beginning  with  the  triplet  with  the  lowest  average  R,  each  job  in  that 
triplet  will  be  considered  with  every  other  grouping  and  a  new  average  R  calculated  using  the 
formula: tr[V R^{-1} V'](1/m). Of these 15 trials, the trial combination with the greatest overall
average R will be selected. This shredding process will be repeated with the next set of triplets
(and eventually sets of doubles) until no substantial increase in average R is obtained. However,
note that for each of the conditions there must always be at least one job in each of the job
families. In addition, once a job cluster has had other jobs added to it, that job cluster is no longer
eligible to be shredded, although it does continue to be eligible to have jobs added
to it.

In  this  way,  selection-efficient  job  family  clusters  are  created.  The  composition  of  the 
job  family  clusters  will  differ  depending  upon  whether  there  are  6,  9,  or  12  job  families.  Note 
also that the composition of the clusters will differ depending upon the data source.

2.  Cross-Sample  Generation  of  Synthetic  Scores 

For  this  research,  20  cross-samples  were  generated  using  model  sampling  techniques  for 
both  Designs  A  and  B.  The  goal  in  generating  these  cross-samples  is  to  obtain  predicted 
performance  scores  (LSEs)  for  all  entities  in  every  job  family.  The  procedure  for  accomplishing 
this  generation  involves  four  stages  which  are  discussed  below: 

a. the generation of random normal deviates;

b. the transformation of normal deviates into test scores simulating the
characteristics of the test scores in the designated population;

c. the transformation of test scores into predicted performance scores
corresponding to each job family for use as assignment variables;

d. the elimination of all performance scores below a cutting score on the
AFQT, which eliminates 25% of all generated entities (to simulate selection).

In the first stage, a uniformly distributed random sequence of numbers ranging from 0
to 1, with an approximate mean of 0.5 and a variance of 0.0833, was produced using a
pseudorandom number generator (Appendix I contains the program for the random number generator).
The choice of a random number generator routine was based on evidence documenting efficient
implementation and empirical tests of the randomness of the program's output (Park & Miller,
1988). A clearly defined algorithm, initial parameters, and a recorded initial seed allow
replication of the experiment (see Appendix I, Tables I-1 and I-2, for the initialization seeds used
for this study). The optimal multipliers for producing the number sequence were based on
Fishman and Moore's (1986) recommendations. Thus, the potential problems inherent in a
random number generator are minimized by careful selection of routines and inputs.
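
A multiplicative congruential generator of the kind discussed by Park and Miller (1988) can be sketched in Python as follows. The multiplier 16807 is their "minimal standard" and is used here only as an assumption for illustration; the text states that the multipliers actually used followed Fishman and Moore (1986), and the seeds used in the study are those recorded in Appendix I, not the seed shown here.

```python
MODULUS = 2**31 - 1   # 2147483647
MULTIPLIER = 16807    # Park & Miller "minimal standard" (assumed for this sketch)


def lehmer_stream(seed, count):
    """Return `count` uniform deviates in (0, 1) and the final generator state."""
    state = seed
    values = []
    for _ in range(count):
        state = (MULTIPLIER * state) % MODULUS
        values.append(state / MODULUS)
    return values, state


uniforms, last_state = lehmer_stream(seed=1, count=10000)
mean = sum(uniforms) / len(uniforms)
variance = sum((u - mean) ** 2 for u in uniforms) / len(uniforms)
print(last_state)  # Park & Miller's published check value: 1043618065
print(round(mean, 3), round(variance, 4))
```

The sample mean and variance should fall close to the theoretical values of 0.5 and 1/12 (about 0.0833) quoted in the text.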

The sequence of uniformly distributed random numbers was transformed into a
distribution of normal variables by calculating the expected mean and dividing by expected
values to give a mean of 0 and standard deviation of 1.0. These calculations produced a matrix
of normal deviates, Xn, of order N by n, where N is the number of entities (individuals) and n
is the number of simulated scores representing the full set of predictors. For Design A, one sample
of N=264 and n=29 was generated for each of twenty separate cross-samples. For Design B,
one sample of N=400 and n=9 was generated for each of twenty separate cross-samples.

The aim at this second stage was to transform the matrix Xn into a matrix of test scores
(Y) for each of the experimental conditions. The generated test scores are required to have
expected covariances equivalent to those of the population, which are represented by the
intercorrelations in R, i.e., E[(1/N)(Y'Y)] = R. A Gramian factor solution (Ft = A D^{1/2} A',
where A and D are the eigenvectors and eigenvalues of the population predictor
intercorrelations, respectively) was used to transform the matrix Xn into a matrix of test scores
(Y). Thus, Y was generated by Y = Xn Ft. In Design A, for the "McLaughlin" and Project A-ASVAB
conditions, only the first nine test scores (corresponding to the nine ASVAB tests) are
retained for the next stage.
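
The Gramian factor transformation can be illustrated with two tests, for which the eigenvalues and eigenvectors of the correlation matrix have a closed form. The sketch below is an assumption-laden toy, not the study's program: the correlation of 0.6 is hypothetical, the helper names are invented, and Python's built-in normal generator stands in for the deviates produced in the first stage.

```python
import math
import random


def gramian_factor_2x2(r):
    """Symmetric square root F = A D^(1/2) A' of R = [[1, r], [r, 1]].

    For this 2 x 2 case the eigenvalues are 1 + r and 1 - r, with
    eigenvectors (1, 1)/sqrt(2) and (1, -1)/sqrt(2).
    """
    s1, s2 = math.sqrt(1.0 + r), math.sqrt(1.0 - r)
    return [[(s1 + s2) / 2, (s1 - s2) / 2],
            [(s1 - s2) / 2, (s1 + s2) / 2]]


def matmul(X, F):
    """Plain-Python matrix product X F."""
    return [[sum(x * f for x, f in zip(row, col)) for col in zip(*F)] for row in X]


r = 0.6                                   # hypothetical population correlation
F = gramian_factor_2x2(r)
FF = matmul(F, F)                         # reproduces R because F is symmetric

rng = random.Random(12345)                # fixed seed so the run can be repeated
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(20000)]
Y = matmul(X, F)                          # Y = X F
sample_cov = sum(y1 * y2 for y1, y2 in Y) / len(Y)
print([[round(v, 6) for v in row] for row in FF], round(sample_cov, 2))
```

Because F F' equals R, the transformed scores have expected covariances equal to the target intercorrelations, which is the requirement stated above.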

c.  Transformation  of  Test  Scores  to  FLS  Composites 

The  aim  in  this  third  stage  was  to  generate  an  N  by  m  matrix  (Z)  of  predicted 
performance  scores  to  be  used  as  assignment  variables,  where  N  is  the  number  of  entities  and 
m  is  the  number  of  jobs.  Thus,  this  Z  matrix  contained  the  predicted  performance  of  each  entity 
in every job family. A transformation matrix of beta weights, W = R^{-1} V', was computed using
the analysis sample data (for Design B the population data were utilized). A different V matrix
was  used  in  each  of  the  18  conditions  because  each  condition  represents  a  different  set  of  job 
families.  The  weights  are  applied  to  the  Y  matrices  of  cross-samples  so  that  the  calculation  of 
predicted  performance  scores  is  accomplished  by  Z  =  YW. 
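
A small worked example of W = R^{-1} V' and Z = YW is sketched below for two predictors and two job families. All numbers and helper names are hypothetical and chosen only so the arithmetic can be checked by hand.

```python
def inverse_2x2(M):
    """Inverse of a 2 x 2 matrix by the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul(A, B):
    """Plain-Python matrix product A B."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(M):
    return [list(col) for col in zip(*M)]


R = [[1.0, 0.5], [0.5, 1.0]]      # hypothetical predictor intercorrelations
V = [[0.4, 0.2],                  # validities of the two tests for job family 1
     [0.1, 0.5]]                  # validities of the two tests for job family 2
W = matmul(inverse_2x2(R), transpose(V))    # n x m matrix of beta weights
Y = [[1.0, 0.0], [0.0, 1.0], [-1.0, 1.0]]   # test scores for three entities
Z = matmul(Y, W)                            # N x m predicted performance scores
print([[round(w, 3) for w in row] for row in W])
print([[round(z, 3) for z in row] for row in Z])
```

Here W works out to [[0.4, -0.2], [0.0, 0.6]], so the third entity is predicted to perform at -0.4 in the first job family and 0.8 in the second.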

d. Selection of Entities by the AFQT Score

Within each sample, a selection ratio of 0.75 was applied based upon a ranking on
synthetic AFQT scores. Thus, entities with the lowest 25% of the AFQT scores were dropped
from the analysis and not considered for assignment. For each cross-sample in Design A, out
of the 264 entities generated, 198 entities were optimally assigned. For Design B, out of the
400 entities generated, 300 entities were optimally assigned. The AFQT score is calculated by
the formula containing the ASVAB tests Arithmetic Reasoning, Numerical Operations, and
Verbal Ability (Appendix B, Table B-2). Note that the Army's computation of the AFQT has
recently been modified to include Mathematical Knowledge instead of the weighted Numerical
Operations test (see Welsh, Kucinkas, & Curran, 1990). The original equation will be used in
this study to coincide with the time-frame of the Project A concurrent validation and
"McLaughlin" data collections.
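
The selection step amounts to ranking entities on the synthetic AFQT score and keeping the top 75%. A minimal Python sketch follows; the function name is hypothetical and the scores are placeholder integers, not AFQT values computed from the Appendix B formula.

```python
import random


def select_by_afqt(afqt_scores, selection_ratio=0.75):
    """Return the indices of entities kept after dropping the lowest scorers."""
    n_keep = round(len(afqt_scores) * selection_ratio)
    ranked = sorted(range(len(afqt_scores)), key=lambda i: afqt_scores[i], reverse=True)
    return sorted(ranked[:n_keep])


rng = random.Random(2024)
scores = list(range(264))      # placeholder scores for one Design A cross-sample
rng.shuffle(scores)
kept = select_by_afqt(scores)
print(len(kept), min(scores[i] for i in kept))  # prints: 198 66
```

With 264 generated entities the rule keeps 198, and with 400 it keeps 300, matching the sample sizes given in the text.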

3.  Assignment  Simulation 

Entities are optimally assigned to job families through the use of a hybrid adaptation of
a primal linear programming (simplex) algorithm. The optimal assignment procedure is a
modified personal computer version of the "NETG" mathematical programming system
(published by Analysis, Research, and Computation, Inc., Austin, Texas). It is implemented
through the use of a circularized network optimization model. In this assignment algorithm,
quotas are met at each iteration while the allocation sum converges toward the final optimal
solution. At the final iteration the objective function, MPP, based on evaluation sample weights,
is maximized.

For this study, entities are optimally assigned to job families, not individual jobs. Once
assignment to a job family has occurred, the entities are randomly assigned to the jobs within that
job family as a second assignment stage. The quotas for job families are set so that there will
be equal quotas for each job in every condition. For Design A, since there were 18 jobs, equal
quotas of 11 entities per job were set so that all 198 selected entities were assigned. For Design
B, since there were 60 jobs, equal quotas of 5 entities per job were set so that all 300 selected
entities were assigned. Note that the actual number of entities assigned to each job family as
a constraint of the optimal assignment algorithm will differ depending upon how many jobs are
in that family.
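
The quota arithmetic and the idea of quota-constrained optimal assignment can be sketched in Python as below. The sketch deliberately swaps in an exhaustive search over a four-entity, two-family toy problem in place of the network simplex procedure used in the study; the function names and predicted scores are hypothetical.

```python
from itertools import combinations


def family_quotas(jobs_per_family, quota_per_job):
    """Quota for each job family: number of jobs in it times the per-job quota."""
    return [k * quota_per_job for k in jobs_per_family]


def best_two_family_assignment(Z, quota_first):
    """Exhaustively find the split of entities between two job families that
    maximizes summed predicted performance, given a quota for family 0."""
    entities = range(len(Z))
    best_total, best_first = float("-inf"), None
    for first in combinations(entities, quota_first):
        chosen = set(first)
        total = sum(Z[i][0] if i in chosen else Z[i][1] for i in entities)
        if total > best_total:
            best_total, best_first = total, first
    return best_first, best_total


# Family sizes 5, 6, 4, 1, 1, 1 (18 jobs in 6 families) with 11 entities per job.
quotas = family_quotas([5, 6, 4, 1, 1, 1], 11)
print(quotas, sum(quotas))

# Hypothetical predicted performance of 4 entities in 2 job families.
Z = [[0.9, 0.1], [0.8, 0.7], [0.2, 0.6], [0.5, 0.3]]
first, total = best_two_family_assignment(Z, quota_first=2)
print(first, round(total, 2))
```

Entity 1 scores higher in family 0 than entity 3 does, yet the optimal solution places entity 3 there because entity 1 loses little by moving to family 1; this is the differential effect that an optimal assignment exploits.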

For all of the designs in this research except Design A-3, the assignment variables are
FLS composites. For Design A-3, U.S. Army aptitude area composites were calculated for use
as assignment variables. After assignment, regression weights derived from the designated
population (the evaluation weights) are used to calculate the MPP of entities in each job.
Assignment is made by job family, but evaluation using MPP, based on weights obtained from
the population, is accomplished separately for each job.

III. EXPECTED FINDINGS AND ACTUAL RESULTS


An MPP standard score represents the average of expected performance for a sample of
entities  on  the  jobs  to  which  each  is  assigned.  The  procedure  described  above  produced  20  MPP 
scores  for  each  condition  in  Designs  A  and  B.  The  expected  findings  for  Designs  A  and  B  are 
described  separately  below. 

Design  A 

1.  Number  of  Job  Families 

a.  The  magnitude  of  the  MPP  scores  will  increase  significantly  as  the  number  of  job 
families  is  increased  from  6  to  9  and  then  to  12  job  families. 

b. The efficiency of classification varies with the number of job families according to
a negatively accelerated function such that the increase in MPP from 6 to 9 job families will be
greater than the increase in MPP from 9 to 12 job families.

2.  Job  Clustering  Methods 

a.  The  CE  clustering  method  will  result  in  significantly  greater  MPP  scores  than  the  SE 
clustering  method  across  all  conditions. 

b.  The  empirical  methods  of  job  clustering  (CE  and  SE)  will  result  in  significantly 
greater  MPP  scores  than  the  current  U.S.  Army  operational  job  families. 

c.  Clustering  based  on  all  29  Project  A  tests  will  provide  significantly  greater  MPP  than 
clustering  based  on  the  standard  9  ASVAB  tests. 

3.  Type  of  Predictor  Measure 

When  the  assignment  variables  are  based  on  all  29  Project  A  tests,  MPP  will  be 
significantly  greater  than  when  the  assignment  variables  are  based  on  the  standard  9  ASVAB 
tests. 

4.  Type  of  Criterion  Measure 

a. There will be no significant difference in MPP scores due to the use of assignment
variables based on the Project A concurrent validation criterion measure, Core Technical
Proficiency, and the "McLaughlin" 1981-1982 criterion measures, Skill Qualification Tests
(SQTs) and end-of-course training scores.

b.  The  conclusions  from  statistical  significance  tests  between  the  levels  of  all  of  the  other 
variables  in  this  design  will  be  the  same  when  either  set  of  data  (differing  with  respect  to  the  two 
kinds  of  criteria)  is  used  to:  (1)  select  job  family  sets,  and  (2)  compute  weights  to  be  applied 
to  assignment  variables.  If  the  conclusions  reached  using  the  two  different  criteria  for 
hypotheses  1  and  2  are  the  same,  then  this  hypothesis  is  accepted.  If  any  significance  test  has 
different  results  for  the  two  criteria,  the  p-values  will  be  examined  to  determine  if  they  are 
within  a  designated  range  (i.e.,  .10)  indicating  that  for  practical  purposes  conclusions  using  the 
two  criteria  are  essentially  the  same. 

5.  FLS  Composites  versus  Aptitude  Area  Composites 

Any  condition  involving  FLS  assignment  will  result  in  significantly  greater  MPP  scores 
than  assignment  based  upon  the  Army  operational  aptitude  area  composites. 

Design  B 

1.  Number  of  Job  Families 

a.  The  magnitude  of  the  MPP  scores  will  increase  significantly  as  the  number  of  job 
families  is  increased  from  9  to  16  and  then  to  23  job  families. 

b.  The  efficiency  of  classification  varies  with  the  number  of  job  families  according  to 
a  negatively  accelerated  function  such  that  the  increase  in  MPP  from  9  to  16  job  families  will 
be  greater  than  the  increase  in  MPP  from  16  to  23  job  families. 

2.  Job  Clustering  Methods 

a.  The  empirical  methods  of  job  clustering  (CE  and  SE)  will  result  in  significantly 
greater  MPP  scores  than  the  current  U.S.  Army  aptitude  area  job  families,  the  Career 
Management  Field  (CMF)  categories,  or  a  combination  of  these  two  operational  groupings. 


A.  Job  Clustering  Results 

The classification-efficient (CE) clustering method was successful in providing job
clusters that minimized the reduction in Horst's differential index, Hd, during each iteration.
The selection-efficient (SE) clustering method, however, was not successful in creating job
clusters that maximized predictive validity. Nevertheless, it was discovered that the CE clusters
actually had very high average weighted R^2 values even though these jobs were clustered based
on Hd. Explanations for the lack of success with the SE method as well as further implications
are discussed below. Also discussed below is a comparison of the operational job families to
the empirical job families in terms of several key indices.

1.  Classification-Efficient  Job  Families 

The  classification-efficient  clustering  method  resulted  in  sets  of  6,  9,  and  12  job  families 
for  Design  A  and  sets  of  9,  16,  and  23  job  families  for  Design  B.  For  Design  A,  18  jobs  were 
clustered into job families using three different data sources (Project A--Experimental Battery;
Project A--ASVAB; and "McLaughlin"). Tables 8, 9, and 10 present the classification-efficient
job  families  for  each  data  source  in  Design  A.  For  Design  B,  60  jobs  from  the  "McLaughlin" 
database  were  clustered  into  job  families.  Tables  11,  12,  and  13  present  the  classification- 
efficient  job  families  for  Design  B. 

Corresponding to each of these sets of job families is a value for Horst's average
differential index, Hd/m, where m is the number of assignment variables (i.e., job families).
Table  14  presents  the  Hd/m  indices  for  all  conditions.  As  expected,  there  were  increases  in  the 
Hd  indices  as  the  number  of  job  families  increased  from  6  to  9  to  12,  and  from  9  to  16  to  23. 
This  finding  gives  an  indication  of  the  amount  of  differential  validity  gained  in  the  "back" 
sample  as  the  number  of  job  families  is  increased. 

Johnson and Zeidner (1990, 1991) criticized the use of a version of Hd proposed by
McLaughlin et al. (1984) as a measure of CE. McLaughlin et al. (1984) proposed use of an
index, M, as a measure of CE when the assignment variables are not FLS composites. The use
of M cannot be justified as comparable to Horst's index of differential validity (Hd) and was
inappropriate for use as a measure of CE. However, McLaughlin et al. (1984) proposed a
baseline measure, H, which appears to be proportional to Hd (H^2 = Hd/m). The index H would
be proportional to Hd across data sets. The use of Hd in the "back" (analysis) sample as an
approximation of CE is comparable to the use of H, but not M, in McLaughlin et al. (1984) as
a measure of CE.

Also included in Table 14 is a ceiling average Hd value, which is the maximum amount
of Hd/m possible in a set of data, determined by calculating Hd for all jobs prior to clustering.
The purpose of the CE clustering method was to minimize the reduction in Hd as jobs are
formed into job families. Comparisons with the ceiling values give an indication of the
Table 8

Design A Classification-Efficient Job Families for Project A (Experimental Battery)
Data Source

6 Job Families

  Job Family 1    Infantryman; NBC Specialist; Petroleum Supply Specialist;
                  Combat Engineer; M48-M60 Armor Crewmember
  Job Family 2    Cannon Crewmember; Light Wheel Vehicle Mechanic; Medical Specialist;
                  Motor Transport Operator; Military Police; Utility Helicopter Repairer
  Job Family 3    MANPADS Crewmember; Administrative Specialist;
                  Food Service Specialist; Unit Supply Specialist
  Job Family 4    TOW/DRAGON Repairer
  Job Family 5    Single Channel Radio Operator
  Job Family 6    Ammunition Specialist

9 Job Families

  Job Family 1    Infantryman; NBC Specialist; Petroleum Supply Specialist;
                  Combat Engineer; M48-M60 Armor Crewmember
  Job Family 2    Cannon Crewmember; Light Wheel Vehicle Mechanic; Medical Specialist
  Job Family 3    MANPADS Crewmember
  Job Family 4    TOW/DRAGON Repairer
  Job Family 5    Single Channel Radio Operator
  Job Family 6    Ammunition Specialist
  Job Family 7    Motor Transport Operator; Military Police; Utility Helicopter Repairer
  Job Family 8    Administrative Specialist; Food Service Specialist
  Job Family 9    Unit Supply Specialist

12 Job Families

  Job Family 1    Infantryman; NBC Specialist; Petroleum Supply Specialist
  Job Family 2    Combat Engineer; M48-M60 Armor Crewmember
  Job Family 3    Cannon Crewmember
  Job Family 4    MANPADS Crewmember
  Job Family 5    TOW/DRAGON Repairer
  Job Family 6    Single Channel Radio Operator
  Job Family 7    Ammunition Specialist
  Job Family 8    Light Wheel Vehicle Mechanic; Medical Specialist
  Job Family 9    Motor Transport Operator; Military Police
  Job Family 10   Utility Helicopter Repairer
  Job Family 11   Administrative Specialist; Food Service Specialist
  Job Family 12   Unit Supply Specialist
Table 9

Design A Classification-Efficient Job Families for Project A (ASVAB tests) Data Source

6 Job Families

Job Family 1: Infantryman; Unit Supply Specialist; NBC Specialist; Single Channel Radio Operator
Job Family 2: Combat Engineer; M48-M60 Armor Crewmember; TOW/DRAGON Repairer
Job Family 3: Cannon Crewmember; Light Wheel Vehicle Mechanic; Utility Helicopter Repairer; Motor Transport Operator
Job Family 4: MANPADS Crewmember; Medical Specialist; Military Police
Job Family 5: Ammunition Specialist; Administrative Specialist; Food Service Specialist
Job Family 6: Petroleum Supply Specialist

9 Job Families

Job Family 1: Infantryman; Unit Supply Specialist; NBC Specialist
Job Family 2: Combat Engineer; M48-M60 Armor Crewmember; TOW/DRAGON Repairer
Job Family 3: Cannon Crewmember; Light Wheel Vehicle Mechanic; Utility Helicopter Repairer
Job Family 4: MANPADS Crewmember; Medical Specialist; Military Police
Job Family 5: Single Channel Radio Operator
Job Family 6: Ammunition Specialist
Job Family 7: Motor Transport Operator
Job Family 8: Administrative Specialist; Food Service Specialist
Job Family 9: Petroleum Supply Specialist

12 Job Families

Job Family 1: Infantryman; Unit Supply Specialist; NBC Specialist
Job Family 2: Combat Engineer; M48-M60 Armor Crewmember
Job Family 3: Cannon Crewmember; Light Wheel Vehicle Mechanic; Utility Helicopter Repairer
Job Family 4: MANPADS Crewmember; Medical Specialist
Job Family 5: TOW/DRAGON Repairer
Job Family 6: Single Channel Radio Operator
Job Family 7: Ammunition Specialist
Job Family 8: Motor Transport Operator
Job Family 9: Administrative Specialist
Job Family 10: Petroleum Supply Specialist
Job Family 11: Food Service Specialist
Job Family 12: Military Police
Table 10

Design A Classification-Efficient Job Families for McLaughlin Data Source

6 Job Families

Job Family 1: Single Channel Radio Operator; Motor Transport Operator; M48-M60 Armor Crewmember; Food Service Specialist; Military Police; Infantryman; Cannon Crewmember
Job Family 2: Combat Engineer; Light Wheel Vehicle Mechanic; Utility Helicopter Repairer; Petroleum Supply Specialist; MANPADS Crewmember
Job Family 3: TOW/DRAGON Repairer
Job Family 4: NBC Specialist
Job Family 5: Ammunition Specialist; Unit Supply Specialist; Administrative Specialist
Job Family 6: Medical Specialist

9 Job Families

Job Family 1: Single Channel Radio Operator; Motor Transport Operator; M48-M60 Armor Crewmember; Food Service Specialist; Military Police
Job Family 2: Infantryman; Cannon Crewmember
Job Family 3: Combat Engineer; Light Wheel Vehicle Mechanic; Utility Helicopter Repairer; Petroleum Supply Specialist
Job Family 4: MANPADS Crewmember
Job Family 5: TOW/DRAGON Repairer
Job Family 6: NBC Specialist
Job Family 7: Ammunition Specialist; Unit Supply Specialist
Job Family 8: Administrative Specialist
Job Family 9: Medical Specialist

12 Job Families

Job Family 1: Single Channel Radio Operator; Motor Transport Operator
Job Family 2: Infantryman; Cannon Crewmember
Job Family 3: Combat Engineer; Light Wheel Vehicle Mechanic
Job Family 4: MANPADS Crewmember
Job Family 5: M48-M60 Armor Crewmember; Food Service Specialist; Military Police
Job Family 6: TOW/DRAGON Repairer
Job Family 7: NBC Specialist
Job Family 8: Ammunition Specialist; Unit Supply Specialist
Job Family 9: Utility Helicopter Repairer
Job Family 10: Administrative Specialist
Job Family 11: Petroleum Supply Specialist
Job Family 12: Medical Specialist
Table 11

Nine Classification-Efficient Job Families for Design B

JOB
FAMILY  MOS                                 n

1       05C  Radio TT Operator *         2393
1       11B  Infantryman *               6355
1       11C  Indirect Fire Infmn         1494
1       12B  Combat Engineer *           3109
1       12C  Bridge Crewman               450
1       13B  Cannon Crmn *               6575
1       13F  Fire Support Sp              693
1       15D  Lance Crmb/MLRS Sgt          281
1       16S  MANPADS Crewmember *         596
1       19D  Cavalry Scout               1249
1       19E  M48-M60 Armor Crmn *        3297
1       31M  Multichannel Comm Eq Op     2482
1       31V  Tac Comm Sysop/Mech          515
1       62B  Construction Equip Rep       233
1       62E  HV Const Equip Rep           202
1       63B  Lt Wh Veh/Pwr Gen Mech *    1495
1       63H  Track Veh Repairer           335
1       64C  Motor Transport Op *        3681
1       67N  Utility Hel Repairer *       511
1       67V  OBN/Scout Hel Rep            294
1       72E  Combat Telecom Center Op     569
1       76W  Petroleum Supply Sp *        664
1       82C  Field Artillery Surveyor     434
1       94B  Food Service Sp *           3943
1       95B  Military Police *           4516
2       05H  Elec War/SIGINT INTER/MC     171
2       98C  Elec War/SIGINT Analyst      186
3       11H  HV Anti-Armor Wpn Infn       979
3       13E  Cannon Fire Direction Sp     627
3       27E  TOW/Dragon Rep *             363
3       31N  Tactical Ckt Con             189
3       55B  Ammunition Sp *              288
3       76Y  Unit Supply Sp *            1149
3       91E  Dental Specialist            203
3       91P  X-Ray Specialist             159
3       92B  Medical Lab Sp               310
4       12F  Engineer Tracked Crmn        151
4       36C  Wire Sys Inst/Op             499
4       63N  M60A1/A3 Tank Sys Mech       286
4       63W  Wheel Veh Mechanic           180
4       68J  Aircraft FC Repairer         148
4       91B  Medical Specialist *         783
5       27F  Vulcan Repairer              130
5       52D  Power Generation Equip Rep   178
5       54E  NBC Specialist *             113
6       43E  Parachute Rigger             100
6       68G  Aircraft Structural Rep      125
6       76C  Eq Rec & Parts Sp            331
7       57H  Cargo Specialist             272
7       71N  Traffic Mgmt Coordinator     163
8       62F  Lifting/Loading Eq Op        129
8       71M  Chapel Activities Sp         182
8       74F  Programmer/Analyst            95
9       71L  Administrative Sp *         2824
9       73C  Finance Specialist           688
9       74D  Computer/Tape Writer         132
9       75B  Personnel Admin Sp          1061
9       76V  Mat Stor & Hdlg Sp           216
9       93H  Air Traffic Con Tower Op     114
9       96B  Intelligence Analyst         218

* = MOS for Design A
Table 12

Classification-Efficient Job Families for Design B

JOB
FAMILY  MOS                                 n

1       05C  Radio TT Operator *         2393
1       11B  Infantryman *               6355
1       11C  Indirect Fire Infmn         1494
1       12C  Bridge Crewman               450
1       13B  Cannon Crmn *               6575
1       13F  Fire Support Sp              693
1       19D  Cavalry Scout               1249
1       19E  M48-M60 Armor Crmn *        3297
1       31V  Tac Comm Sysop/Mech          515
1       62B  Construction Equip Rep       233
1       62E  HV Const Equip Rep           202
1       63H  Track Veh Repairer           335
1       64C  Motor Transport Op *        3681
1       67V  OBN/Scout Hel Rep            294
1       82C  Field Artillery Surveyor     434
1       94B  Food Service Sp *           3943
1       95B  Military Police *           4516
2       05H  Elec War/SIGINT INTER/MC     171
2       98C  Elec War/SIGINT Analyst      186
3       11H  HV Anti-Armor Wpn Infn       979
3       13E  Cannon Fire Direction Sp     627
3       55B  Ammunition Sp *              288
3       76Y  Unit Supply Sp *            1149
3       91E  Dental Specialist            203
4       12B  Combat Engineer *           3109
4       15D  Lance Crmb/MLRS Sgt          281
4       63B  Lt Wh Veh/Pwr Gen Mech *    1495
4       67N  Utility Hel Repairer *       511
4       76W  Petroleum Supply Sp *        664
5       12F  Engineer Tracked Crmn        151
5       63W  Wheel Veh Mechanic           180
6       16S  MANPADS Crewmember *         596
6       31M  Multichannel Comm Eq Op     2482
6       72E  Combat Telecom Center Op     569
7       27E  TOW/Dragon Rep *             363
7       92B  Medical Lab Sp               310
8       27F  Vulcan Repairer              130
8       52D  Power Generation Equip Rep   178
8       54E  NBC Specialist *             113
9       31N  Tactical Ckt Con             189
9       91P  X-Ray Specialist             159
10      36C  Wire Sys Inst/Op             499
10      63N  M60A1/A3 Tank Sys Mech       286
10      68J  Aircraft FC Repairer         148
10      91B  Medical Specialist *         783
11      43E  Parachute Rigger             100
12      57H  Cargo Specialist             272
12      71N  Traffic Mgmt Coordinator     163
13      62F  Lifting/Loading Eq Op        129
13      71M  Chapel Activities Sp         182
14      68G  Aircraft Structural Rep      125
14      76C  Eq Rec & Parts Sp            331
15      71L  Administrative Sp *         2824
15      73C  Finance Specialist           688
15      74D  Computer/Tape Writer         132
15      75B  Personnel Admin Sp          1061
15      76V  Mat Stor & Hdlg Sp           216
15      93H  Air Traffic Con Tower Op     114
15      96B  Intelligence Analyst         218
16      74F  Programmer/Analyst            95

* = MOS for Design A
Table 13

JOB
FAMILY  MOS                                 n

1       05C  Radio TT Operator *         2393
1       11B  Infantryman *               6355
1       11C  Indirect Fire Infmn         1494
1       13B  Cannon Crmn *               6575
1       19D  Cavalry Scout               1249
1       31V  Tac Comm Sysop/Mech          515
1       63H  Track Veh Repairer           335
1       64C  Motor Transport Op *        3681
1       94B  Food Service Sp *           3943
1       95B  Military Police *           4516
2       05H  Elec War/SIGINT INTER/MC     171
2       98C  Elec War/SIGINT Analyst      186
3       11H  HV Anti-Armor Wpn Infn       979
3       13E  Cannon Fire Direction Sp     627
3       55B  Ammunition Sp *              288
3       76Y  Unit Supply Sp *            1149
3       91E  Dental Specialist            203
4       12B  Combat Engineer *           3109
4       15D  Lance Crmb/MLRS Sgt          281
4       63B  Lt Wh Veh/Pwr Gen Mech *    1495
4       67N  Utility Hel Repairer *       511
4       76W  Petroleum Supply Sp *        664
5       12C  Bridge Crewman               450
5       13F  Fire Support Sp              693
5       19E  M48-M60 Armor Crmn *        3297
5       62B  Construction Equip Rep       233
5       62E  HV Const Equip Rep           202
5       67V  OBN/Scout Hel Rep            294
5       82C  Field Artillery Surveyor     434
6       12F  Engineer Tracked Crmn        151
7       16S  MANPADS Crewmember *         596
7       31M  Multichannel Comm Eq Op     2482
7       72E  Combat Telecom Center Op     569
8       27E  TOW/Dragon Rep *             363
9       27F  Vulcan Repairer              130
9       52D  Power Generation Equip Rep   178
10      31N  Tactical Ckt Con             189
10      91P  X-Ray Specialist             159
11      36C  Wire Sys Inst/Op             499
11      68J  Aircraft FC Repairer         148
11      91B  Medical Specialist *         783
12      43E  Parachute Rigger             100
13      54E  NBC Specialist *             113
14      57H  Cargo Specialist             272
14      71N  Traffic Mgmt Coordinator     163
15      62F  Lifting/Loading Eq Op        129
15      71M  Chapel Activities Sp         182
16      63N  M60A1/A3 Tank Sys Mech       286
17      63W  Wheel Veh Mechanic           180
18      68G  Aircraft Structural Rep      125
18      76C  Eq Rec & Parts Sp            331
19      71L  Administrative Sp *         2824
19      73C  Finance Specialist           688
19      75B  Personnel Admin Sp          1061
20      74D  Computer/Tape Writer         132
20      96B  Intelligence Analyst         218
21      74F  Programmer/Analyst            95
22      76V  Mat Stor & Hdlg Sp           216
22      93H  Air Traffic Con Tower Op     114
23      92B  Medical Lab Sp               310

* = MOS for Design A
Table 14

Comparison of Hd/m(a) Indices Across Conditions

DESIGN A
                              Project A     Project A     McLaughlin
                              Exp. Batt.    ASVAB         ASVAB

Ceiling Hd/m(b)               3.340         1.377         0.759
12 CE Job Families            2.712         1.225         0.728
9 CE Job Families             2.264         1.074         0.683
6 CE Job Families             1.662         0.860         0.602
9 Operational Job Families    1.963         0.805         --

DESIGN B
                              CE            Operational
                              Method        Method

Ceiling Hd/m(b)               4.490         4.490
23 Job Families               3.893         2.730
16 Job Families               3.545         2.350
9 Job Families                3.025         0.924

(a) The value m represents the number of assignment variables (i.e., job families).
(b) Ceiling values were calculated using all jobs individually before grouping into job
families.
reduction in Hd expected when grouping jobs into job families. For example, for the Project A
(Experimental Battery) condition in Design A, grouping 18 jobs into 9 job families using the CE
method resulted in a one-third decrease in average Hd. For Design B, grouping 60 jobs into 23
job families using the CE method resulted in a one-sixth decrease in average Hd.
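
The size of these reductions can be checked directly from the Table 14 entries. The short Python sketch below is only an illustration: the function name and constants are introduced here (they are not part of the original analysis), and the values are the Table 14 figures as printed. Because it works from those rounded table entries, the fractions quoted in the text should be read as approximate.

```python
# Average Hd values copied from Table 14 as printed.
CEILING_A_EXP_BATT = 3.340  # Design A, Project A (Exp. Batt.), 18 individual jobs
CE_9_A_EXP_BATT = 2.264     # same data source, 9 CE job families
CEILING_B = 4.490           # Design B, 60 individual jobs
CE_23_B = 3.893             # Design B, 23 CE job families


def fractional_decrease(ceiling: float, clustered: float) -> float:
    """Share of the ceiling average Hd lost when jobs are grouped into families."""
    return (ceiling - clustered) / ceiling


loss_a = fractional_decrease(CEILING_A_EXP_BATT, CE_9_A_EXP_BATT)
loss_b = fractional_decrease(CEILING_B, CE_23_B)
print(f"Design A, 18 jobs into 9 families:  {loss_a:.3f}")
print(f"Design B, 60 jobs into 23 families: {loss_b:.3f}")
```

The same function can be applied to any other row of Table 14 to compare how much of the ceiling each clustering condition retains.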

Table 14 also contains the average Hd indices (Hd/m) for the operational job families used
in this research. For Design A, 18 jobs were grouped into the nine operational aptitude areas
currently used by the Army and Hd values were calculated using both the Proj.A (Exp.Batt.) and
Proj.A  (ASVAB)  data  sources.  Note  that  the  average  Hd  values  for  the  nine  operational  job 
families  were  substantially  lower  than  the  values  for  the  nine  CE  job  families  for  both  data 
sources.  For  the  Proj.A  (ASVAB)  data  source,  the  average  Hd  value  for  the  nine  operational 
job  families  was  even  lower  than  for  the  six  CE  job  family  condition.  For  Design  B,  the 
average  Hd  values  for  the  operational  job  families  were  also  substantially  lower  than  for  the  CE 
job  families  across  all  conditions. 

Overall, comparisons of these average Hd results in Table 14 give preliminary but
inconclusive evidence for many of the expected findings presented earlier. This evidence is
inconclusive because, although it has been shown that Hd can be linked to MPP through its
relationship to Brogden's measure of classification efficiency, this is a "back" sample
relationship. Only by conducting simulations based upon the real data, generating MPP scores,
and  statistically  analyzing  these  scores  can  the  evidence  be  conclusively  presented  in  independent 
"cross"  samples. 

For  Design  A,  examination  of  the  jobs  within  each  of  the  job  families  across  all 
conditions  shows  that  the  CE  clustering  technique  resulted  in  job  families  of  varying  sizes 
ranging  anywhere  from  1  job  to  7  jobs.  Across  the  sources  of  data,  the  Proj.A  (Exp.Batt.) 
condition resulted in job families that shared some similarities with the Proj.A (ASVAB)
condition.  For  example,  Infantryman  and  NBC  Specialist  clustered  in  the  same  family  for  both 
conditions  and  Combat  Engineer  and  M48-M60  Armor  Crewmember  clustered  together  for  both 
conditions.  The  "McLaughlin"  data  resulted  in  clusters  that  appeared  to  have  little  in  common 
with  either  of  the  Project  A  concurrent  validation  data  set  clusters. 

For Design B, the classification-efficient job families formed ranged in size from 1 job
to 25 jobs. Comparison of these job families with the operational job families presented earlier
(see Tables 5, 6, and 7) revealed little similarity between the two groupings. The operational
job families contained generally logical groupings of similar types of jobs. The CE job families
often grouped together more diverse types of jobs. The MOS in a priori job families are, of
course, in a particular family because they appear to belong to that family. Thus, their higher
face validity for membership in their family should not be surprising. Across all CE conditions,
the first job family was always the largest, with the first job family gradually getting larger as
the number of formed clusters decreased from 23 to 16 to 9. This first job family was generally
composed of combat jobs, mechanic and repair jobs, and various other specialist jobs. The
diversity of jobs in the CE job families indicates that it would not be easy to form
classification-efficient clusters a priori.

There  are  two  other  indices  that  are  important  for  evaluating  the  classification-efficient 
clusters  and  comparing  them  to  the  operational  clusters.  The  first  is  the  predictive  validity  or 
average  weighted  R2  value  across  job  family  conditions,  and  the  second  is  the  average 
intercorrelation  among  the  LSEs  (r). 

Table  15  provides  a  comparison  of  the  average  weighted  R2  values  for  all  conditions. 
The  average  weighted  R2  value  for  each  condition  was  calculated  by  weighting  the  R2  values  for 
each  job  family  by  the  number  of  jobs  in  that  job  family,  adding  these  values,  and  dividing  by 
the total number of jobs. As with the Hd index, there was a maximum amount of R2, calculated
using each job individually, that acts as a ceiling for the highest average R2 values possible for
a given condition. From Table 15, note that even though the CE job families were formed based
on Hd, their average weighted R2 values are fairly high in comparison to the ceiling values.
Naturally,  the  12  cluster  and  23  cluster  conditions  were  closest  to  the  ceiling  values  calculated 
using  all  jobs  (either  18  or  60  jobs).  However,  grouping  the  jobs  into  smaller  sets  of  job 
families  for  Designs  A  and  B  also  did  not  result  in  large  decrements  in  R2.  This  finding  appears 
to  be  supportive  of  validity  generalization  (VG)  theory  and  may  be  another  excellent  example 
of  how  DAT  and  VG  are  not  necessarily  inconsistent. 
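
The weighting rule described above (weight each family's R2 by its number of jobs, sum, and divide by the total number of jobs) can be sketched in a few lines of Python. The function name is introduced here for illustration. Table 15 reports only the averaged values, so the example borrows the family-level R2 values and family sizes printed in the "After Shredding" column of Table 17; because those printed R2 values are rounded to three decimals, the result agrees with the reported average only to about three decimals.

```python
def average_weighted_r2(families):
    """Average R2 across job families, weighting each family by its job count.

    families: list of (r2, number_of_jobs) pairs, one pair per job family.
    """
    total_jobs = sum(n_jobs for _, n_jobs in families)
    weighted_sum = sum(r2 * n_jobs for r2, n_jobs in families)
    return weighted_sum / total_jobs


# (R2, number of jobs) per family, read from the "After Shredding" column of Table 17.
after_shredding = [
    (0.651, 4),
    (0.625, 2),
    (0.515, 3),
    (0.443, 3),
    (0.381, 3),
    (0.246, 3),
]

result = average_weighted_r2(after_shredding)
print(f"Average weighted R2 = {result:.4f}")
```

When every family contains the same number of jobs, the weighted average reduces to the simple mean of the family R2 values.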

From  Table  15,  note  also  that  the  average  weighted  R2  values  for  the  operational  job 
families  are  only  slightly  lower  than  the  CE  job  families  for  both  Designs  A  and  B.  Recall  that 
there  did  appear  to  be  fairly  substantial  differences  between  the  operational  job  families  and  the 
CE job families in terms of the Hd index (see Table 14). Thus, it appears that the CE empirical
Table 15

Comparison of Average Weighted R2 Indices Across Conditions

DESIGN A
                              Project A     Project A     McLaughlin
                              Exp. Batt.    ASVAB         ASVAB

Ceiling R2(a)                 .589          .437          .342
12 CE Job Families            .554          .428          .340
9 CE Job Families             .529          .419          .338
6 CE Job Families             .496          .407          .333
9 Operational Job Families    .508          .404          --

DESIGN B
                              CE            Operational
                              Method        Method

Ceiling R2(a)                 .374          .374
23 Job Families               .364          .343
16 Job Families               .358          .336
9 Job Families                .349          .313

(a) Ceiling values were calculated as the average of the R2 values computed separately
for each job.
clustering method provided a meaningful improvement in Hd compared to the operational system,
with the predictive validity (R2) of both methods remaining virtually identical.

The  last  important  index  that  can  be  examined  is  the  intercorrelations  among  the  LSEs, 
r.  Increasing  the  number  of  job  families  in  a  classification-efficient  manner  should  decrease  the 
intercorrelation  among  the  LSEs.  This  effect  occurs  because  each  increase  in  the  number  of  job 
families  should  provide  greater  uniqueness  for  the  job  families.  Table  16  shows  the  average 
intercorrelations  among  the  LSEs  for  the  different  job  family  conditions.  As  expected,  for  both 
of  the  Project  A  concurrent  validation  data  sources  in  Design  A,  the  average  intercorrelation 
decreased  steadily  as  the  number  of  job  families  increased  from  6  to  9  to  12.  Unexpectedly,  this 
relationship  did  not  hold  for  the  "McLaughlin"  data  source  in  Design  A.  In  fact,  the  average 
intercorrelation  for  "McLaughlin"  increased  as  the  number  of  job  families  increased.  However, 
with  the  number  of  jobs  expanded  to  60  in  Design  B  for  the  "McLaughlin"  data  (see  Table  16), 
the  expected  relationship  of  a  decrease  in  the  average  intercorrelations  when  the  number  of  job 
families  increased  was  apparent.  From  Table  16,  also  note  that  the  average  intercorrelations 
among  the  LSEs  for  the  nine  operational  job  families  in  Design  A  fell  just  above  the  average  for 
the  nine  CE  job  families  for  both  data  sources.  Likewise,  the  average  intercorrelations  for  the 
operational  job  families  in  Design  B  were  greater  than  the  average  intercorrelations  for  the  CE 
job  families. 
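
As an illustration of the index r, the sketch below averages the off-diagonal entries of a correlation matrix among LSEs. It assumes that r is the simple unweighted mean over all distinct pairs of job families, which the text does not state explicitly, and the 3 x 3 matrix is hypothetical, made up only to show the calculation.

```python
def average_intercorrelation(corr):
    """Mean of the off-diagonal entries of a symmetric correlation matrix.

    Assumption: every distinct pair of job families counts equally.
    """
    m = len(corr)
    pairs = [(i, j) for i in range(m) for j in range(i + 1, m)]
    return sum(corr[i][j] for i, j in pairs) / len(pairs)


# Hypothetical correlations among the LSEs for three job families.
hypothetical_corr = [
    [1.00, 0.80, 0.70],
    [0.80, 1.00, 0.75],
    [0.70, 0.75, 1.00],
]

r_bar = average_intercorrelation(hypothetical_corr)
print(f"Average intercorrelation = {r_bar:.3f}")
```

Lower values of this average mean that the assignment variables carry more distinct information about each job family, which is the pattern the CE conditions in Table 16 are expected to show as the number of families grows.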

2.  Selection-Efficient  Job  Families 

The  selection-efficient  (SE)  job  clustering  method  developed  for  this  research  did  not 
provide  a  solution  that  was  at  all  credible  in  approximating  the  maximization  of  predictive 
validity  (R2).  As  discussed  previously  in  the  method  section,  the  SE  method  developed  for  this 
research  was  a  two-stage  heuristic  approach  intended  to  provide  a  practical  method  of  obtaining 
a  close  approximation  to  the  highest  possible  predictive  validity  in  a  set  of  job  clusters  without 
having  to  evaluate  every  possible  combination  of  jobs  in  terms  of  predictive  validity. 

The  first  stage  of  the  algorithm  was  an  initial  combination  of  jobs  into  job  clusters.  For 
the  six  job  family  condition  (Design  A),  this  meant  clustering  the  18  jobs  into  six  sets  of  families 
each  containing  three  jobs.  Sets  of  jobs  were  selected  so  that  six  non-overlapping  families  were 
chosen  that  had  the  highest  R2  values.  The  R2  values  for  all  possible  sets  of  three  jobs  were 
computed  and  the  triplet  with  no  overlap  with  the  first  set  that  had  the  next  highest  R2  was  then 
Table 16

Comparison of the Average Intercorrelations Among the LSEs
for all Job Family Conditions

DESIGN A
                              Project A     Project A     McLaughlin
                              Exp. Batt.    ASVAB         ASVAB

12 CE Job Families            .699          .843          .876
9 CE Job Families             .720          .851          .862
6 CE Job Families             .763          .909          .833
9 Operational Job Families    .750          .873          --

DESIGN B
                              CE            Operational
                              Method        Method

23 Job Families               .775          .906
16 Job Families               .805          .907
9 Job Families                .809          .973
selected. This process was continued until all jobs were placed in a selected triplet. For the 9
and 12 job family conditions, the 18 jobs were initially grouped into sets of two jobs (for the
12 job family condition most families had two jobs but a few had only one job).
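
The first stage described above amounts to a greedy selection of non-overlapping job sets, ranked by R2. The Python sketch below shows that idea under stated assumptions: the function names are introduced here, and the scoring function is a toy stand-in (the mean of made-up per-job values), whereas in the research the R2 for a set of jobs came from the validity data. It is an illustration of the greedy logic, not the program used in the study.

```python
from itertools import combinations


def greedy_initial_clusters(jobs, size, score):
    """Pick non-overlapping job sets of `size`, taking the highest score first."""
    # Score every possible set of `size` jobs, best first.
    candidates = sorted(combinations(jobs, size), key=score, reverse=True)
    unused = set(jobs)
    clusters = []
    for combo in candidates:
        if unused.issuperset(combo):  # no overlap with sets already chosen
            clusters.append(combo)
            unused.difference_update(combo)
        if len(unused) < size:
            break
    if unused:  # leftover jobs that cannot fill a full set
        clusters.append(tuple(sorted(unused)))
    return clusters


# Hypothetical per-job values used only to build a toy scoring function.
toy_validity = {"A": 0.9, "B": 0.8, "C": 0.7, "D": 0.3, "E": 0.2, "F": 0.1}


def toy_score(combo):
    """Stand-in for the R2 of a job set (assumption: mean of the toy values)."""
    return sum(toy_validity[job] for job in combo) / len(combo)


clusters = greedy_initial_clusters(list(toy_validity), 3, toy_score)
print(clusters)
```

With 18 jobs and sets of three there are 816 candidate triplets, so scoring all of them is inexpensive; the cost that the heuristic avoids is evaluating every possible partition of the jobs into families.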

The  second  stage  of  the  algorithm  involved  shredding  these  initial  clusters  to  determine 
if  there  were  other  combinations  of  jobs  that  had  a  higher  predictive  validity.  This  shredding 
process  had  a  constraint  that  once  jobs  were  added  to  an  evolving  cluster,  that  cluster  was  no 
longer eligible to be shredded. Without this constraint, the algorithm would have been infeasible
because  it  would  have  been  no  different  than  evaluating  all  possible  combinations  of  jobs. 

Tables  17  and  18  give  examples  of  six  job  families  formed  using  the  SE  method  for 
Proj.A  (Exp.Batt)  and  Proj.A  (ASVAB)  data  sources.  The  first  column  shows  the  initial  set  of 
clusters  from  the  first  stage  and  the  second  column  shows  the  final  set  of  clusters  after  the 
second  stage  shredding  process.  Although  the  initial  set  of  clusters  appeared  perfectly 
reasonable,  note  that  the  overall  average  weighted  R2  value  was  not  very  high.  In  fact, 
examination of Table 15 presented earlier for the CE job families reveals that the CE job families
had  higher  R2  values  than  these  SE  job  families.  This  was  true  for  all  conditions  in  this 
experiment. 

It became apparent after examination of the data that forcing the jobs into initial sets of
clusters  severely  restricted  the  R2  values.  Some  of  the  jobs  individually  had  very  high  R2  values, 
but  they  were  forced  together  with  one  or  two  other  jobs  by  the  nature  of  the  algorithm  thereby 
lowering  their  potential  contribution  to  R2.  The  second  stage  shredding  process  was  designed 
to  eliminate  these  problems  with  the  first  stage.  From  examination  of  Tables  17  and  18,  it  can 
be  seen  that  the  second  stage  process  provided  only  a  slight  increase  in  R2.  Once  again  it 
became  apparent  from  examination  of  the  data  that  the  jobs  which  individually  made  the  most 
contribution  to  R2  were  never  isolated  through  this  second  stage  of  the  algorithm.  Instead,  these 
jobs  had  other  jobs  combined  with  them,  and  due  to  the  constraints  of  the  algorithm  this  new 
job  cluster  was  no  longer  eligible  to  be  shredded. 

Several attempts were made to develop reasonable alternative ways of forming
selection-efficient clusters. A modification to the algorithm was attempted that shredded jobs beginning
with  the  cluster  that  had  the  highest  R2  value  instead  of  the  lowest  R2  value.  This  was  an 
attempt  to  provide  an  opportunity  for  the  jobs  that  contributed  the  most  to  overall  R2  to  be 
Table 17

Six Job Families for Project A (Experimental Battery) Using Selection-Efficient
Clustering Method

Initial Clusters

Job Family 1 (R2 = 0.711): TOW/DRAGON Repairer; Petroleum Supply Specialist; Food Service Specialist
Job Family 2 (R2 = 0.571): Infantryman; Combat Engineer; M48-M60 Armor Crewmember
Job Family 3 (R2 = 0.515): MANPADS Crewmember; Single Channel Radio Operator; NBC Specialist
Job Family 4 (R2 = 0.443): Ammunition Specialist; Administrative Specialist; Unit Supply Specialist
Job Family 5 (R2 = 0.381): Light Wheel Vehicle Mechanic; Utility Helicopter Repairer; Medical Specialist
Job Family 6 (R2 = 0.246): Cannon Crewmember; Motor Transport Operator; Military Police

Average Weighted R2 = 0.477893

After Shredding

Job Family 1 (R2 = 0.651): TOW/DRAGON Repairer; Petroleum Supply Specialist; Food Service Specialist; Infantryman
Job Family 2 (R2 = 0.625): Combat Engineer; M48-M60 Armor Crewmember
Job Family 3 (R2 = 0.515): MANPADS Crewmember; Single Channel Radio Operator; NBC Specialist
Job Family 4 (R2 = 0.443): Ammunition Specialist; Administrative Specialist; Unit Supply Specialist
Job Family 5 (R2 = 0.381): Light Wheel Vehicle Mechanic; Utility Helicopter Repairer; Medical Specialist
Job Family 6 (R2 = 0.246): Cannon Crewmember; Motor Transport Operator; Military Police

Average Weighted R2 = 0.478225
Table 18

Six Job Families for Project A (ASVAB) Data Using Selection-Efficient
Clustering Method

Initial Clusters

Job Family 1 (R2 = 0.629): Combat Engineer; TOW/DRAGON Repairer; Petroleum Supply Specialist
Job Family 2 (R2 = 0.493): M48-M60 Armor Crewmember; NBC Specialist; Food Service Specialist
Job Family 3 (R2 = 0.416): Infantryman; Single Channel Radio Operator; Unit Supply Specialist
Job Family 4 (R2 = 0.361): MANPADS Crewmember; Administrative Specialist; Medical Specialist
Job Family 5 (R2 = 0.296): Ammunition Specialist; Light Wheel Vehicle Mechanic; Utility Helicopter Repairer
Job Family 6 (R2 = 0.182): Cannon Crewmember; Motor Transport Operator; Military Police

Average Weighted R2 = 0.396380

After Shredding

Job Family 1 (R2 = 0.536): Combat Engineer; TOW/DRAGON Repairer; Petroleum Supply Specialist; Infantryman; M48-M60 Armor Crewmember; NBC Specialist
Job Family 2 (R2 = 0.620): Food Service Specialist
Job Family 3 (R2 = 0.418): Single Channel Radio Operator; Unit Supply Specialist
Job Family 4 (R2 = 0.361): MANPADS Crewmember; Administrative Specialist; Medical Specialist
Job Family 5 (R2 = 0.296): Ammunition Specialist; Light Wheel Vehicle Mechanic; Utility Helicopter Repairer
Job Family 6 (R2 = 0.182): Cannon Crewmember; Motor Transport Operator; Military Police

Average Weighted R2 = 0.399455
isolated. Although some different rearrangements of the jobs resulted from this procedure,
average  weighted  R2  values  were  not  increased.  Another  attempt  was  made  to  apply  the  second 
stage  shredding  process  to  the  classification-efficient  clusters.  The  idea  was  to  try  to 
substantially  increase  the  R2  values  of  the  CE  clusters  by  rearranging  the  jobs  and  designate  this 
new  set  of  families  as  the  SE  job  families.  Not  surprisingly,  given  the  already  high  R2  values 
for  the  CE  clusters,  this  procedure  resulted  in  little  or  no  improvement  in  average  weighted  R2 
values. 

Given these disappointing results for the SE clustering method, it became obvious that
there was nothing to gain in retaining this condition in the basic research design (Design A-1).
Without any viable SE job families, it was not desirable to include this condition as part of the
model sampling experiment. This change meant that Design A-1 was modified to contain only
two independent variables forming a 3 x 3 design. Designs A-2 and A-3 were not affected by
these  changes.  For  Design  A-2,  the  CE  job  clusters  represented  the  "empirical"  method  of 
clustering  to  be  compared  with  the  Army’s  operational  clusters  (see  Table  2).  Additionally, 
there  would  have  been  no  point  in  using  this  SE  algorithm  with  the  60  jobs  in  Design  B,  so  that 
the  CE  job  clustering  method  also  represented  the  "empirical"  method  of  clustering  for  Design 
B. 

B.  Model  Sampling  Experiment  Results 

The model sampling experiment involved the simulated assignment of 20 cross-samples
of entities to job families under 12 different experimental assignment conditions for Design A
and 6 different experimental assignment conditions for Design B. Table 19 shows the MPP
standard scores averaged across the 20 replications for each assignment condition for Designs
A-1, A-2, and A-3. Table 20 shows the MPP standard scores averaged across the 20
replications for each assignment condition for Design B.
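
To make the quantity being averaged concrete, the following is a toy sketch of quota-constrained assignment and the resulting mean predicted performance (MPP). It is not the assignment algorithm used in this research, which the text does not specify here; the function name `optimal_assignment`, the score matrix, and the quotas are all hypothetical, and brute-force search is only workable for very small examples.

```python
from itertools import permutations


def optimal_assignment(scores, quotas):
    """Brute-force the quota-respecting assignment with the highest mean score.

    scores[i][j] is the predicted performance (standard score) of entity i
    in job family j; quotas[j] is the number of entities family j receives.
    Both are hypothetical inputs used only for illustration.
    """
    # Expand quotas into one slot per opening, e.g. [2, 1] -> [0, 0, 1].
    slots = [j for j, q in enumerate(quotas) for _ in range(q)]
    if len(slots) != len(scores):
        raise ValueError("quotas must sum to the number of entities")

    best_total, best = float("-inf"), None
    # Each distinct ordering of the slots is one way to place the entities.
    for perm in set(permutations(slots)):
        total = sum(scores[i][j] for i, j in enumerate(perm))
        if total > best_total:
            best_total, best = total, perm
    # MPP is the mean predicted performance over all assigned entities.
    return list(best), best_total / len(scores)


# Four hypothetical entities, two job families with two openings each.
toy_scores = [[0.9, 0.1], [0.8, 0.7], [0.2, 0.6], [0.3, 0.4]]
assignment, mpp = optimal_assignment(toy_scores, quotas=[2, 2])
print(assignment, round(mpp, 3))
```

In the experiment described above, a value of this kind would be computed in each of the 20 cross-samples and then averaged within a condition.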

Table 19

Means and Standard Deviations of MPP Standard Scores for all Conditions in Design A

DESIGN A-1: Classification-Efficient Job Clustering Method/Assignment with FLS Composites

                        Project A     Project A     McLaughlin
                        Exp. Batt.    ASVAB         ASVAB
                        D1            D2            D3

J1: 6 Job Families      .464          .416          .278
                        (.037)        (.045)        (.036)

J2: 9 Job Families      .539          .470          .261
                        (.039)        (.046)        (.039)

J3: 12 Job Families     .592          .502          .286
                        (.049)        (.052)        (.040)

DESIGN A-2: Empirical (CE Clustering) versus Army Operational Job Families/Assignment with
FLS Composites

                        Project A     Project A
                        Exp. Batt.    ASVAB
                        D1            D2

M1: Empirical*          .539          .470
    9 Job Families      (.039)        (.046)

M2: Operational         .505          .439
    9 Job Families      (.048)        (.041)

DESIGN A-3: Army Operational Job Families/Assignment with Aptitude Areas

                        Project A
                        Aptitude Areas
                        D1

M2: Operational         .317
    9 Job Families      (.050)

Note. Standard deviations appear in parentheses below means.
*Values for Empirical conditions come directly from Design A-1.


Table 20

Means and Standard Deviations of MPP Standard Scores for all Conditions in Design B

DESIGN B: McLaughlin Data with 60 Jobs/Assignment with FLS Composites

                        Clustering Method

                        Empirical*    Operational
                        M1            M2

J1: 9 Job Families      .480          .349
                        (.033)        (.032)

J2: 16 Job Families     .545          .472
                        (.036)        (.037)

J3: 23 Job Families     .588          .511
                        (.034)        (.034)

Note. Standard deviations appear in parentheses below means.
*Classification-efficient clustering method.


Before performing any statistical tests on these results, it was first necessary to separate
out the effects due to classification from the effects due to selection. This research is concerned
with demonstrating the benefits in terms of increased classification effects under differing
experimental conditions. For this reason, it was desirable to subtract out of the MPP values the
contribution due to selection effects, leaving only the contribution due to classification. This
process can be accomplished by using the Naylor-Shine equation and table for determining the
increase in mean criterion score obtained by using a selection device (Naylor & Shine, 1965).
The basic equation underlying the Naylor-Shine approach is:

$$\bar{Z}_{y_i} = r_{xy} \, \frac{\lambda_i}{\phi_i}$$

where

$\bar{Z}_{y_i}$ = the mean criterion score (in standard score units) of all cases above the predictor cutoff
$r_{xy}$ = the validity coefficient
$\lambda_i$ = the ordinate of the normal distribution at the predictor cutoff
$\phi_i$ = the selection ratio.


A selection ratio of .75 was used in all of the simulations conducted for this research.
In addition, the AFQT was used as the selection device, so it is the validity of the AFQT that is
used in this equation. The average AFQT validity for the Project A concurrent validation data
sources was calculated as 0.531. The average AFQT validity for the "McLaughlin" data with
18 jobs was 0.4905. These validities are different because the Project A concurrent validation
data utilizes the CTP criterion and the "McLaughlin" data set utilizes the SQT criterion. The
average AFQT validity for the "McLaughlin" data with 60 jobs was 0.504. Calculation of the
mean criterion score above the cutoff yields an expected MPP due to selection alone of 0.225 for
the Project A data sources, 0.2078 for the "McLaughlin" data set with 18 jobs, and 0.214 for the
"McLaughlin" data set with 60 jobs. These constant values were subtracted from the MPP standard
score values for the appropriate conditions shown in Tables 19 and 20. The revised set of means
representing only the effects due to classification is shown in Tables 21 and 22.
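
The selection constants quoted above can be reproduced with a short calculation. The sketch below assumes the Naylor-Shine relation in its usual form (validity times the normal ordinate at the cutoff, divided by the selection ratio); the function name `selection_mpp` is ours and does not come from the report.

```python
from statistics import NormalDist


def selection_mpp(validity, selection_ratio):
    """Expected mean criterion score (standard units) of those selected.

    Assumes a standard normal predictor and top-down selection, so the
    cutoff is the z value that leaves `selection_ratio` of cases above it.
    """
    std_normal = NormalDist()
    cutoff = std_normal.inv_cdf(1.0 - selection_ratio)
    ordinate = std_normal.pdf(cutoff)  # height of the normal curve at the cutoff
    return validity * ordinate / selection_ratio


# AFQT validities reported in the text, all with a selection ratio of .75.
for label, validity in [("Project A", 0.531),
                        ("McLaughlin, 18 jobs", 0.4905),
                        ("McLaughlin, 60 jobs", 0.504)]:
    print(label, round(selection_mpp(validity, 0.75), 4))

# Removing the selection component from one Table 19 cell (D1, 6 job families).
classification_only = 0.464 - round(selection_mpp(0.531, 0.75), 3)
print(round(classification_only, 3))
```

Subtracting the appropriate constant from each mean in Tables 19 and 20 in this way gives the classification-only means, up to rounding of the published values.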

In  the  next  sections,  the  results  shown  in  Tables  21  and  22  are  discussed  in  terms  of  the 
expected  findings  stated  earlier.  The  discussion  of  the  statistical  analyses  will  be  presented  first 
for  Design  A  and  then  for  Design  B. 

1.  Design  A:  Number  of  Job  Families 


One of the primary expected findings of this research states that the magnitude of the
MPP scores will increase significantly as the number of job families increases from 6 to 9 and
then to 12 job families. On simple inspection of Table 21, it can be seen that the rank order of
magnitude of MPP scores fell in the hypothesized direction for two of the three data sources.


Table 21

Means of MPP Standard Scores for all Conditions in Design A for Effects Due Only to
Classification

DESIGN A-1: Classification-Efficient Job Clustering Method/Assignment with FLS Composites

                        Project A     Project A     McLaughlin
                        Exp. Batt.    ASVAB         ASVAB
                        D1            D2            D3

J1: 6 Job Families      .239          .191          .070
J2: 9 Job Families      .314          .245          .054
J3: 12 Job Families     .367          .277          .078

DESIGN A-2: Empirical (CE Clustering) versus Army Operational Job Families/Assignment with
FLS Composites

                        Project A     Project A
                        Exp. Batt.    ASVAB
                        D1            D2

M1: Empirical^a         .314          .245
    9 Job Families

M2: Operational         .280          .214
    9 Job Families

DESIGN A-3: Army Operational Job Families/Assignment with Aptitude Areas

                        Project A
                        Aptitude Areas
                        D1

M2: Operational         .092
    9 Job Families

^a Values for Empirical conditions come directly from Design A-1.


Table 22

Means of MPP Standard Scores for all Conditions in Design B for Effects Due Only to
Classification

DESIGN B: McLaughlin Data with 60 Jobs/Assignment with FLS Composites

                        Clustering Method

                        Empirical^a   Operational
                        M1            M2

J1: 9 Job Families      .266          .135
J2: 16 Job Families     .331          .258
J3: 23 Job Families     .374          .297

The statistical significance of these differences was addressed by performing a 3 x 3
repeated measures analysis of variance. The results from this analysis are presented in Table
23. Both main effects and the interaction between the data source factor and the job family
factor were significant (p < .0001). Because it was apparent that the "McLaughlin" data source
did not support the number of job families hypothesis, a separate 2 x 3 repeated measures
ANOVA was subsequently performed for only the Project A concurrent validation data sources.
Discussion of the results for the "McLaughlin" data will be presented in a later section on types
of criteria.
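
As a check on how the F ratios in Table 23 relate to its other columns, the sketch below recomputes each F as the effect mean square divided by the mean square of its own error term. The helper name `f_ratio` is ours; the sums of squares and degrees of freedom are copied from Table 23, and small discrepancies from the printed F values are expected because the published sums of squares are rounded.

```python
def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """F = (SS_effect / df_effect) / (SS_error / df_error)."""
    ms_effect = ss_effect / df_effect
    ms_error = ss_error / df_error
    return ms_effect / ms_error


# (SS effect, df effect, SS error, df error) as printed in Table 23.
table_23 = {
    "D (data source)":       (1.823, 2, 0.0250627, 38),
    "J (# of job families)": (0.1650, 2, 0.013684, 38),
    "D x J":                 (0.0831, 4, 0.0154, 76),
}

for source, (ss_e, df_e, ss_err, df_err) in table_23.items():
    print(f"{source:24s} F({df_e},{df_err}) = {f_ratio(ss_e, df_e, ss_err, df_err):.2f}")
```

Each effect is tested against its own effect-by-replication error term, which is why the error degrees of freedom differ between the main effects (38) and the interaction (76).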

The  results  from  the  subsequent  analysis  on  only  the  Project  A  concurrent  validation  data 
sources  are  presented  in  Table  24.  Once  again,  both  main  effects  and  the  interaction  between 
the  data  source  factor  and  the  job  family  factor  were  significant.  Thus,  for  the  two  Project  A 
data  sources,  the  significant  main  effect  of  J  allowed  the  null  hypothesis  of  no  difference 
between  the  means  for  6,  9,  and  12  job  families  to  be  rejected  with  a  high  level  of  confidence 
(p  <  .0001).  The  significant  interaction  term  indicates  that  the  two  Project  A  data  sources  were 
affected  differently  by  the  increase  from  6  to  9  to  12  job  families.  Examination  of  the  data 
revealed  that  the  Proj.A  (Exp.Batt)  data  source  resulted  in  slightly  greater  increases  in  MPP 
from  6  to  9  to  12  job  families  than  did  the  Proj.  A  (ASVAB)  data  source. 

A second expected finding investigated in the research involving the number of job family
conditions predicts that the efficiency of classification will vary according to a negatively
accelerated function such that the increase in MPP from 6 to 9 job families will be greater than
the increase in MPP from 9 to 12 job families. From examination of the means in Table 21,
this hypothesis appears to hold for the Project A data sources. This hypothesis was statistically
tested with the use of paired-comparison t-tests for each of the Project A data sources. The
difference between each MPP value from 6 to 9 job families was compared to the difference
between each MPP value from 9 to 12 job families. The null hypothesis states that the
difference between these two comparisons is zero. Confirmation of the hypothesis occurs if the
difference from 6 to 9 job families is greater than the difference from 9 to 12. Thus, a one-tailed
t-test is appropriate for this situation.
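
The comparison just described amounts to a one-sample t-test on a difference of differences computed within each replication. The following is a minimal sketch under that reading; the function name `acceleration_t` and the four replication values are invented for illustration and are not data from this research, which used 20 replications per condition.

```python
import math
from statistics import mean, stdev


def acceleration_t(mpp_low, mpp_mid, mpp_high):
    """Paired t statistic for (mid - low) versus (high - mid).

    Each argument holds one MPP value per replication, in the same order.
    A positive t indicates that the first increase exceeds the second,
    i.e. a negatively accelerated gain. Returns (t, degrees of freedom).
    """
    diffs = [(m - lo) - (hi - m) for lo, m, hi in zip(mpp_low, mpp_mid, mpp_high)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1


# Hypothetical MPP values for 6, 9, and 12 job families in four replications.
six = [0.20, 0.22, 0.24, 0.21]
nine = [0.30, 0.31, 0.33, 0.32]
twelve = [0.35, 0.36, 0.37, 0.36]

t_value, df = acceleration_t(six, nine, twelve)
print(f"t({df}) = {t_value:.2f}")
```

For the one-tailed test, the resulting t is compared with the upper-tail critical value of the t distribution on n - 1 degrees of freedom (19 in this research).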
Table 23

Repeated Measures ANOVA of MPP Standard Scores for Design A

Sources of Variation        SS           df    MS            F          p

D (data source)             1.823         2    0.9114        1381.85    <.0001
Error (d)                   0.0250627    38    0.0006595
J (# of job families)       0.1650        2    0.0825         229.12    <.0001
Error (j)                   0.013684     38    0.00036011
D x J                       0.0831        4    0.0208         102.28    <.0001
Error (dj)                  0.0154       76    0.00020309

Note. A two factor repeated measure design is described by Winer, Brown and Michels
(1991) p. 561. Error (d) equals D x subj. w. groups.


Table 24

Repeated Measures ANOVA of MPP Standard Scores: Comparison of Proj.A (Exp.Batt) and
Proj.A (ASVAB) Data Sources for all Job Family Levels

Sources of Variation        SS           df    MS            F          p

D (data source)             0.1438        1    0.1438         297.24    <.0001
Error (d)                   0.0091942    19    0.0004839
J (# of job families)       0.2331        2    0.1165         489.16    <.0001
Error (j)                   0.0090529    38    0.0002382
D x J                       0.0087        2    0.0044          20.28    <.0001
Error (dj)                  0.00818      38    0.0002153

Note. A two factor repeated measure design is described by Winer, Brown and Michels
(1991) p. 561. Error (d) equals D x subj. w. groups.

As predicted, for the Proj.A (Exp.Batt), the difference from 6 to 9 job families was
significantly greater than the difference from 9 to 12 job families, t(19) = 3.29, p < .003. For
the Proj.A (ASVAB) data source, the difference from 6 to 9 job families was also significantly
greater than the difference from 9 to 12 job families, t(19) = 3.12, p < .005. Thus, there does
appear to be evidence from these results to support the proposition that the efficiency of
classification varies with the number of job families according to a negatively accelerated
function.

2.  Design  A:  Job  Clustering  Methods 

Initially,  one  of  the  major  expected  findings  of  this  research  was  that  the  CE  method  of 
job  clustering  would  result  in  significantly  greater  MPP  scores  than  the  SE  method.  Because 
the  SE  job  clustering  method  was  not  successful  in  producing  job  clusters  that  maximized 
selection-efficiency,  the  SE  algorithm  was  no  longer  a  credible  alternative  and  this  hypothesis 
was  no  longer  worth  testing.  However,  another  major  expected  finding  of  this  research  stated 
that  an  empirical  method  of  clustering  (i.e.,  the  CE  method)  would  result  in  significantly  greater 
MPP  scores  than  the  current  U.S.  Army  operational  job  families.  This  hypothesis  is  testable 
and  is  represented  in  Table  21  as  Design  A-2. 

On  simple  inspection  of  Table  21,  one  can  see  that  the  operational  MPP  standard  scores 
(i.e.,  the  results  obtained  on  the  9  operational  families)  for  Design  A-2  are  lower  than  the 
empirical  MPP  standard  scores  (i.e.,  the  results  obtained  for  the  CE  families)  for  both  data 
source  conditions.  Table  25  provides  the  results  for  the  repeated  measures  ANOVA  of  Design 
A-2.  Note  that  the  main  effects  for  the  clustering  methods  factor  and  the  data  source  factor  are 
both  significant.  Thus,  there  is  support  for  the  hypothesis  that  the  empirical  CE  method  of 
clustering  resulted  in  significantly  greater  MPP  scores  than  the  operational  job  family  method. 
Note  that  the  interaction  between  data  source  and  clustering  method  is  not  significant.  From  the 
means  in  Table  21  for  Design  A-2,  it  is  apparent  that  the  differences  between  the  empirical  and 
operational  methods  were  virtually  identical  for  both  data  sources  eliminating  any  interaction 
effect. 

3. Design A: Type of Predictor Measure

The availability of the Project A concurrent validation experimental battery (20
experimental predictors added to the 9 ASVAB tests) allowed a determination of the effects on
MPP of expanding the predictor space to include spatial, psychomotor, biodata, and interest
predictors. The use of the full set of experimental plus ASVAB predictors (29 tests) for job
clustering and assignment was compared to the use of only the 9 ASVAB tests. As discussed
earlier, the composition of the job families formed using the Proj.A (Exp.Batt) predictors shared
some similarities with the job families formed using the Proj.A (ASVAB) predictors. In terms
of the resulting MPP scores after assignment, however, note from Table 21 that in all cases the
Proj.A (ASVAB) means are lower than the Proj.A (Exp.Batt) means. In addition, from the
repeated measures ANOVAs presented in Tables 24 and 25, it is apparent that this data source
factor is significant for all conditions in both Designs A-1 and A-2. Thus, there is evidence to
support the hypothesis that when assignment variables are based on all 29 Project A tests, MPP
will be significantly greater than when the assignment variables are based on the standard 9
ASVAB tests.


Table 25

Repeated Measures ANOVA of MPP Standard Scores for Design A-2

Sources of Variation        SS           df    MS            F          p

D (data source)             0.0907        1    0.0907         252.23    <.0001
Error (d)                   0.0068320    19    0.0003596
M (clustering method)       0.0209        1    0.0209          85.87    <.0001
Error (m)                   0.0046182    19    0.0002431
D x M                       0.00004       1    0.00004          0.11    <.7394
Error (dm)                  0.0071946    19    0.0003787

Note. A two factor repeated measure design is described by Winer, Brown and Michels
(1991) p. 561. Error (d) equals D x subj. w. groups.

4. Design A: Type of Criterion Measure

The purpose of matching the same 18 jobs in the Project A data and the "McLaughlin"
data for Design A was to directly compare the effects on classification efficiency of using more
routinely and inexpensively collected criteria (i.e., SQTs and training grades) with the specially
designed Project A criteria (i.e., CTP). It was hypothesized that there would be no significant
difference in MPP scores due to the use of assignment variables based on the Project A criterion
and the "McLaughlin" criterion.

As  Table  21  shows,  the  mean  MPP  scores  after  removal  of  selection  effects  are  much 
lower  for  the  "McLaughlin"  data  condition  than  for  the  Project  A  (ASVAB)  condition.  Table 
26  shows  a  2  x  3  repeated  measures  ANOVA  of  the  MPP  standard  scores  comparing  the  Proj.A 
(ASVAB)  and  "McLaughlin"  data  sources.  These  two  data  sources  were  isolated  for  comparison 
across  the  job  family  conditions  because  they  both  share  the  same  predictors  (i.e.,  the  ASVAB) 
but  differ  in  terms  of  criterion  measures.  Table  26  shows  that  contrary  to  the  stated  hypothesis 
of  no  difference,  there  was  a  significant  difference  between  the  data  sources. 

Table 26

Sources of Variation        SS           df    MS            F          p

D (data source)             0.8696        1    0.8696         955.87    <.0001
Error (d)                   0.0172847    19    0.0009097
J (# of job families)       0.0451        2    0.0226          65.06    <.0001
Error (j)                   0.0131771    38    0.0003468
D x J                       0.0374        2    0.0187          91.70    <.0001
Error (dj)                  0.0077492    38    0.0002039

Note. A two factor repeated measure design is described by Winer, Brown and Michels
(1991) p. 561. Error (d) equals D x subj. w. groups.


An additional prediction was also stated to allow for statistically significant differences
between the two data sources (Project A concurrent and "McLaughlin"), if the conclusions
reached about the other expected findings in this research were the same with both data sources.
In other words, if the conclusions about the effect on classification efficiency of increasing the
number of job families were the same for both "McLaughlin" and Project A concurrent data,
but the "McLaughlin" MPP values were somewhat lower overall, this prediction could still be
supported.

Unfortunately,  even  this  second  hypothesis  did  not  hold.  From  Table  21,  one  can  see 
that  the  "McLaughlin"  data  source  resulted  in  a  reversal  of  MPP  values  from  6  to  9  job  families. 
Table  27  presents  a  repeated  measures  ANOVA  for  just  the  "McLaughlin"  data  source  across 
all  three  job  family  levels.  Note  from  Table  27  that  although  the  mean  differences  across  the 
job  family  conditions  (see  Table  21)  appear  very  slight,  they  were  statistically  significant. 

Additional statistical comparisons of the job family levels for the "McLaughlin" data
revealed that the 6 job family condition (J1) was not significantly different from the 12 job
family condition (J3), F(1,19) = 1.43, p < .2468. However, the 9 job family condition (J2)
was significantly different from both the 6 job family condition, F(1,19) = 9.28, p < .0066,
and the 12 job family condition, F(1,19) = 33.66, p < .0001.

These Design A results are unfortunate in that they do not provide preliminary evidence
of the usefulness of the "McLaughlin" validity data with the SQT criterion for the structuring
of jobs into families. However, Design B, with its 60 jobs, permits the examination of
important methodological issues, even if we cannot argue for the immediate
usefulness of SQT validity data as the primary basis of a restructuring of Army job families.
The results found for Design A with only 18 jobs for the "McLaughlin" data could be caused
by the interaction of a variety of factors. First, with only 18 jobs the somewhat poorer
psychometric properties of the SQT criterion (e.g., criterion-referenced, lack of discriminability)
compared to the CTP criterion could have contributed to the inconsistent findings. Recall that
there was an unexpected reversal of the results for the intercorrelations among the LSEs, r, for
the "McLaughlin" data with 18 jobs. These reversals in the intercorrelation magnitudes due to
the psychometric properties of the data could be an explanation for these inconsistent findings.
From Brogden's (1959) formulation it is known that r is an important component in the
estimation of MPP. Thus, although Horst's differential index, Hd, and the predictive validity,
R², indicated that MPP should increase as the number of job families increased, the average
intercorrelations among the LSEs across the conditions indicated otherwise.
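
For readers who want the dependence on r spelled out, Brogden's (1959) result is commonly written in the form below. This statement is taken from the general classification literature rather than from this report, so the notation, and in particular the symbol f(m), is an assumption here.

```latex
\mathrm{MPP} = R \, \sqrt{1 - r} \; f(m)
```

Here R is the (assumed equal) validity of the least squares estimates, r is their average intercorrelation, and f(m) is an order-statistic function that increases with the number of jobs m. Under this form, a rise in r as job families are added can offset gains in R and f(m), which is consistent with the explanation offered above for the "McLaughlin" reversal.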
Table 27

Sources of Variation        SS           df    MS            F          p

J (# of job families)       0.0063        2    0.0032          10.08    <.0003
Error (j)                   0.01188474   38    0.00031276

Note. Error (j) equals J x subj. w. groups.


Second, it is important to recall that in Design A, although the "McLaughlin" data with
the SQT/training criterion was used for job clustering and calculation of assignment weights,
evaluation of assignment and calculation of the MPP results was based on the designated
population values. This design was used to carefully control the error that results from the use
of the same sample for computing weights for assignment and evaluation. The designated
population for Design A was the Project A concurrent validation data with the full set of 29
predictors and the CTP criterion. Thus, Design A was based on the assumption that the Project
A population parameters for both predictors and criterion represented "truth" in the population.
Using different content (e.g., CTP criterion) to evaluate the effect of assignment variables
obtained using the "McLaughlin" data (with the SQT/training criteria) is the major factor
contributing to the overall low MPP values obtained using the "McLaughlin" data as the source
of assignment variables in Design A.

Finally, because of the desire to have the same 18 jobs included for the "McLaughlin" data
source condition as were contained in the Project A data for Design A, some compromises were
made that may have influenced the results. Three of the 18 jobs that were included had
end-of-course training scores as the criterion instead of SQT scores. Even though every attempt
was made to equalize these two criterion sources, the training scores are known to have even
poorer psychometric properties than the SQT criterion. In Design B, these same three jobs with
the training criterion were also included, but there were an additional 57 jobs instead of 15 jobs,
so the influence of these three jobs should not be as great.

The conclusions from Design A regarding SQT apply to a methodological study that
assumes that the concurrent validation Project A CTP criterion represents "truth" in the
population. In Design B, we use the SQT to investigate methodological hypotheses because it
was the best sample of validities available over a large number of jobs. Alternatively, we could
have constructed values for a validity matrix by judgment and by aggregating data from several
other studies. Such an approach could have been defended for this methodological study.
However, we wanted a matrix of validities which would behave as much as possible like real
validity coefficients. In addition, in Design B, remember that the SQT criterion was used for
computing both assignment weights and evaluation weights. Under these circumstances, we
believe that the validity matrices based on the SQT criterion have the same statistical
characteristics (although on the average somewhat lower) as would similar validities, if available,
based on the CTP criterion. Therefore, the findings from Design A did not deter us from
proceeding with a similar methodological study for Design B using the SQT criterion.

5.  Design  A:  FLS  Assignment  versus  Aptitude  Area  Assignment 

The  present  research  presented  an  opportunity  to  compare  the  use  of  FLS  composites  for 
assignment  directly  to  the  use  of  the  aptitude  area  composites.  Table  21  (Design  A-3)  gives  that 
part  of  the  MPP  score  remaining  after  that  part  of  the  total  MPP  due  to  selection  has  been 
subtracted  for  the  condition  in  which  entities  were  assigned  to  the  9  operational  job  families  with 
the  use  of  the  Army  aptitude  area  composites.  Note  that  this  value  was  very  low  indicating  very 
little  classification  potential  when  the  current  aptitude  area  composites  are  used  for  assignment. 

The expected finding stated that any condition involving FLS assignment would result in
significantly greater MPP scores than assignment based upon the Army operational aptitude area
composite. By comparing the single cell result in Design A-3 with the lowest MPP using
Project A data from either of the other two designs (A-1 or A-2), it is possible to conclude that
any condition involving FLS assignment is better than aptitude area assignment. The lowest
MPP score for either Design A-1 or A-2 using Project A data resulted when the Proj.A
(ASVAB) data was used for FLS assignment to the 6 CE job families. As expected, assignment
with FLS for this lowest condition (M = .191) was significantly better than assignment with
aptitude area composites (M = .092), F(1,19) = 680.18, p < .0001. Thus, there was
substantial support from these results for the hypothesis that FLS assignment resulted in
significantly greater MPP scores than aptitude area assignment.

6. Design B: Number of Job Families

From Table 22, it is apparent that the magnitude of the MPP scores increased as the
number of job families increased from 9 to 16 and then to 23 job families. The statistical
significance of these differences was addressed by performing a 2 x 3 repeated measures analysis
of variance. The results from this analysis are presented in Table 28. Both main effects and
the interaction between the clustering method factor and the job family factor were significant
(p < .0001). The significant interaction term indicates that the two clustering methods were
affected differently by the increase from 9 to 16 to 23 job families. Examination of the data
revealed that the use of the operational clusters resulted in greater increases in MPP from 9 to
16 job families than did the empirical clusters.


Table 28

Repeated Measures ANOVA of MPP Standard Scores for Design B

Sources of Variation        SS           df    MS            F          p

D (clustering methods)      0.2633        1    0.2633        6297.30    <.0001
Error (d)                   0.0007945    19    0.0000418
J (# of job families)       0.3814        2    0.1907        3289.39    <.0001
Error (j)                   0.0022031    38    0.00005798
D x J                       0.0211        2    0.0105         330.65    <.0001
Error (dj)                  0.00121233   38    0.0000319

Note. A two factor repeated measure design is described by Winer, Brown and Michels
(1991) p. 561. Error (d) equals D x subj. w. groups.

Similar to Design A, it was expected that the efficiency of classification would vary with
the number of job families according to a negatively accelerated function such that the increase
in MPP from 9 to 16 job families would be greater than the increase in MPP from 16 to 23 job
families. From examination of the means in Table 22, it appears that there is support for this
hypothesis. A paired comparison t-test was performed for each of the clustering methods
(empirical vs. operational). The difference between each MPP value from 9 to 16 job families
was compared to the difference between each MPP value from 16 to 23 job families. Once
again, a one-tailed t-test was used for this statistical test.

As predicted, for the empirical CE clustering condition, the difference from 9 to 16 job
families was significantly greater than the difference from 16 to 23 job families, t(19) = 6.87,
p < .0001. For the operational clustering condition, the difference from 9 to 16 job families
was also significantly greater than the difference from 16 to 23 job families, t(19) = 18.97, p
< .0001. Thus, once again, there does appear to be evidence from these results to support the
proposition that the efficiency of classification varies with the number of job families according
to a negatively accelerated function.

7. Design B: Job Clustering Methods

It was expected that the empirical, classification-efficient method of clustering would
result in significantly greater MPP scores than the operational methods of clustering (aptitude
areas, CMF categories, and a combination of these two groupings). Upon simple inspection
of Table 22, it is apparent that for all job family conditions the empirical MPP scores were
greater than the operational MPP scores. The repeated measures analysis of variance presented
earlier in Table 28 also provides the statistical test for this set of conditions. Note from Table
28 that the main effect for the clustering methods factor was significant (p < .0001). The
interaction between the clustering methods factor and the job family factor noted earlier can be
explained further by noting that the differences in mean MPP between the empirical and
operational methods were fairly consistent for the 16 and 23 job family conditions (i.e., .073
and .077, respectively). However, the difference in mean MPP between the empirical and
operational methods for the 9 job family condition was almost double, at .131. This indicates
that the operational 9 job families based upon the aptitude areas performed even more poorly
than expected, resulting in an interaction effect.
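
The gaps quoted above follow directly from the classification-only means in Table 22. The short sketch below recomputes them, along with the gain from 9 to 16 job families under each clustering method; the dictionary names are ours and the values are copied from Table 22.

```python
# Classification-only MPP means from Table 22, keyed by number of job families.
empirical = {9: 0.266, 16: 0.331, 23: 0.374}
operational = {9: 0.135, 16: 0.258, 23: 0.297}

# Empirical minus operational at each level of the job family factor.
gaps = {k: round(empirical[k] - operational[k], 3) for k in empirical}
print(gaps)

# Gain from 9 to 16 job families within each clustering method.
gain_empirical = round(empirical[16] - empirical[9], 3)
gain_operational = round(operational[16] - operational[9], 3)
print(gain_empirical, gain_operational)
```

The larger gap at 9 job families, together with the larger 9-to-16 gain for the operational clusters, is the pattern that the significant D x J interaction in Table 28 reflects.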

8.  Effect  of  Sample  Size 

This  study  also  contributes  to  our  knowledge  of  the  effect  that  analysis  sample  size  has 
on  MPP  computed  in  independent  samples.  This  effect  is  true  for  research  designs  in  which  the 
regression  weights  for  FLS  composites  are  computed  in  the  analysis  sample  and  MPP  computed 
in  one  or  more  cross-samples.  This  knowledge  is  based  on  inclusive  information  that  relates  to 
only  one  condition  of  Design  A  —  the  condition  in  which  FLS  experimental  composites  are  the 
AVs  to  optimally  assign  entities  to  12  CE  empirically  formed  families. 

Design  A  used  analysis  samples  to  form  AVs  which  could  be  characterized  as  moderate 
in  size.  A  companion  study  by  Whetzel  (1991)  exists  in  which  the  analysis  sample  used  to 
compute the AVs for use in a comparable condition is infinitely large. The use of the designated
population  as  the  analysis  sample  implies  that  the  analysis  sample  is  infinitely  large.  We  wish 
to  estimate  the  correction  factor  which  should  be  applied  to  values  of  MPP  computed  using 
analysis  samples  of  infinite  size  in  order  to  estimate  what  the  value  of  MPP  would  have  been 
if  the  analysis  samples  had  been  of  the  size  used  in  Design  A. 

Whetzel (1991) provided an MPP value for the results of a simulation in which the close
equivalent of FLS experimental composites serves as AVs for assignment to 11 out of the 18
jobs. These 11 were selected because they most completely spanned the joint predictor-criterion
space defined by 11 of the 18 jobs. Therefore, these 11 jobs would be expected to provide a
higher  potential  classification  efficiency  than  obtained  in  the  12  job  family  condition  of  Design 
A.  The  MPP  provided  from  classification  effects,  that  is,  after  the  component  of  MPP  due  to 
selection  is  removed,  should  also  be  larger  since  the  Whetzel  (1991)  study  used  a  more  efficient 
selection  variable  ("g"  instead  of  AFQT). 

The  MPP  standard  score  obtained  in  Whetzel  (1991)  for  a  somewhat  comparable 
condition  to  the  Design  A  condition  described  above  is  .722.  This  contrasts  with  the  MPP 
standard score for the Design A condition of .592, yielding a difference of .130 and a correction
factor of .180. As noted above, this correction factor is an overestimate for the Design A data
on two counts. Thus, we estimate that the proper value of this correction factor lies between
.10 and .15. Since the empirical samples of Design B have an average size of over twice those
of Design A and the AVs corresponding to the families are based on a division of 60 jobs among
the 11 or 12 families, instead of the division of 18 jobs, this estimate obtained from Design A
must be an overestimate of the correction factor for Design B. Considering all of the above,
we estimate that the correction factor for Design B (i.e., the multiplier to be applied to the
obtained MPPs to provide unbiased estimates of MPP) could conservatively be estimated to fall
between .05 and .10. This correction factor could be applied to the MPP values of Table 22 to
adjust for the possible effects of correlated error between assignment and evaluation variables
(even though these values are cross-sample results).
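
The arithmetic behind the .180 figure can be checked with a short sketch. The text does not state the formula, so the sketch assumes that the correction factor is the difference between the two MPP values expressed as a proportion of the infinite-sample value; the function name is illustrative only.

```python
def correction_factor(mpp_infinite, mpp_finite):
    """Proportional drop in MPP relative to the infinite-sample value (assumed formula)."""
    return (mpp_infinite - mpp_finite) / mpp_infinite

# Values quoted in the text: Whetzel (1991) versus the Design A condition.
difference = 0.722 - 0.592
factor = correction_factor(0.722, 0.592)
print(f"difference = {difference:.3f}, correction factor = {factor:.3f}")
```

Under this assumed definition the two quoted MPP values reproduce both the .130 difference and the .180 factor.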

C.  Estimating  the  Practical  Significance  of  Gains  in  MPP 

Finding  statistical  significance  permits  consideration  of  the  practical  significance  of  the 
MPP  values  across  the  various  experimental  conditions.  It  is  possible  to  calculate  percentage 
gains  in  MPP  for  various  conditions  of  interest  from  the  mean  MPP  values  presented  in  Tables 
21  and  22.  Actual  gains  in  MPP  standard  scores  also  can  be  translated  directly  into  dollar 
estimates  of  the  value  of  increased  productivity  using  the  cost  and  benefits  analyses  of  Nord  and 
Schmitz  (1989,  1991). 

Nord  and  Schmitz  (1989,  1991)  conducted  a  utility  analysis  of  the  merits  of  alternative 
manpower  policies  for  the  U.S.  Army.  Their  goal  was  to  obtain  realistic  estimates  of  the  costs 
and  benefits  of  changing  job  entry  standards  and  allocation  procedures.  To  determine  dollar 
estimates,  Nord  and  Schmitz  (1989,  1991)  used  a  net  present  value  (NPV)  model  for 
performance  valuation  which  is  a  refinement  of  the  approach  developed  by  Brogden  (1951)  and 
developed  further  by  many  other  personnel  testing  researchers  (Boudreau,  1983;  Cascio,  1987; 
Hunter  &  Schmidt,  1982). 

Nord  and  Schmitz  (1989,  1991)  found  that  optimal  assignment  using  FLS  prediction 
(based  on  aptitude  areas  rather  than  ASVAB  test  scores)  resulted  in  a  .143  increase  in  mean 
predicted  performance  over  assignment  using  the  current  Army  selection  and  classification 
system.  The  net  economic  value  of  this  gain  was  estimated  to  be  $262  million  for  one  year. 
This  estimate  will  be  used  to  provide  a  basis  for  extrapolating  utility  estimates  from  the  results 
of the present study.

In using this estimate for the present study, it is assumed that the dollar values are linear
throughout  the  range  of  performance  values.  It  is  also  important  to  point  out  that  Nord  and 
Schmitz’s  costing  estimates  were  based  on  1988  costs.  Thus,  the  $262  million  would  be  a  slight 
underestimate  of  today’s  value.  However,  Nord  and  Schmitz  based  their  estimates  on  1984 
entry-level  accessions  of  120,000  per  year.  This  rate  of  accession  has  actually  declined  over  the 
past  several  years.  This  would  suggest  that  the  $262  million  is  a  slight  overestimate  of  today’s 
values.  For  the  purposes  of  approximating  economic  values  for  the  present  research,  it  will  be 
assumed  that  these  two  factors  tend  to  balance  one  another  and  that  the  $262  million  is  a  good 
estimate  for  extrapolation  purposes. 
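
The linear extrapolation described above can be sketched as follows. The two anchor values are those quoted from Nord and Schmitz; the function and constant names are illustrative, and the sketch assumes strict proportionality with no adjustment for cost year or accession rate.

```python
# Anchor values from Nord and Schmitz (1989, 1991) as quoted in the text.
ANCHOR_GAIN_MPP = 0.143        # gain in mean predicted performance
ANCHOR_VALUE_MILLIONS = 262.0  # estimated net value for one year, in $ millions

def dollar_value_millions(mpp_gain):
    """Linear extrapolation: dollar value scales in proportion to the MPP gain."""
    return mpp_gain * ANCHOR_VALUE_MILLIONS / ANCHOR_GAIN_MPP

for gain in (0.075, 0.123, 0.153):
    print(f"MPP gain {gain:.3f} -> about ${dollar_value_millions(gain):.0f} million per year")
```

Each .001 of MPP is worth roughly $1.8 million per year under this assumption, so the dollar figures inherit any error in the linearity assumption.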

Interpreting  gains  in  MPP  with  dollar  value  estimates  is  important  because  it  gives  some 
concrete  meaning  to  the  increases  observed  in  MPP  values.  Dollar  value  estimates  provide  a 
scale  for  comparison  of  alternative  policies  and  comparison  to  other  research  in  this  same  area. 
The  dollar  values  that  are  given,  however,  should  not  be  considered  absolute  expected  increases 
since  each  organization  needs  to  determine  its  own  dollar  values.  Although  these  dollar  value 
estimates  were  extrapolated  directly  from  the  comprehensive  utility  analysis  completed  by  Nord 
and  Schmitz  (1989,  1991),  it  is  important  to  point  out  that  there  were  certain  operational 
constraints  that  were  not  taken  into  account  in  this  simulation  that  could  affect  the  utility  benefits 
to  the  Army.  For  example,  in  this  simulation,  equal  quotas  per  job  were  used  instead  of 
operational  quotas  needed  by  the  Army,  and  each  job  was  considered  to  have  equal  value  to  the 
organization.  In  addition,  it  was  not  possible  to  take  into  account  operational  constraints  such 
as  sex  (certain  Army  jobs  are  not  open  to  women  and  both  combat  support  and  combat  service 
support  units  have  quotas  for  women  soldiers)  and  availability  of  training  slots.  Also,  this 
simulation  assumed  that  individuals  would  accept  the  job  to  which  they  have  been  optimally 
assigned.  Since  the  Army  is  a  volunteer  system,  this  is  an  operational  problem  that  must  be 
confronted  in  setting  up  an  efficient  selection/classification  system.  This  last  problem  is 
mitigated  somewhat  with  the  use  of  job  families.  Assignment  to  job  families  and  then 
consideration  of  preferences  in  the  further  assignment  to  the  jobs  within  these  families  could 
allow  for  enough  freedom  of  choice  to  satisfy  many  potential  recruits. 

Table 29 gives the differences and percentage gains between the average MPP scores
from Tables 21 and 22 for all of the major comparisons for the hypotheses stated earlier. It is
apparent that increasing the number of job families has a significant effect on MPP scores. For
Design A, the percentage gains from increasing the number of job families from 6 to 9 ranged
from 28% to 31%. The percentage gains from increasing the number of job families from 9 to
12 ranged from 12% to 17%. For Design B, increasing the number of job families from 9 to
16 resulted in gains of 24.4% in MPP when CE clustering was used, and gains of 91.1% when
the operational clusters were used. Increasing the number of job families from 16 to 23 resulted
in gains of 13% in MPP when CE clustering was used, and gains of 15.1% when the operational
clusters were used.

Table 29

Differences and Percentage Gains in MPP Scores for All Major Comparisons

DESIGN A

                                     Project A                  Project A
Number of Job Families               Experimental Battery       ASVAB
                                     Difference   % Gain        Difference   % Gain
Increase from:
  6 to 9                             .075         31.3          .055         28.7
  9 to 12                            .053         16.9          .032         12.9
  6 to 12                            .128         53.5          .086         45.3

                                     Project A                  Project A
Clustering Method                    Experimental Battery       ASVAB
                                     Difference   % Gain        Difference   % Gain
  Empirical over Operational         .034         12.0          .031         14.4

Type of Predictor                    Average Across Job Families
                                     Difference   % Gain
  Project A (Exp.Batt) over
  Project A (ASVAB)                  .069         29.1

FLS versus                           Project A ASVAB
Aptitude Area Assignment             Difference   % Gain
  Empirical w/FLS over
  Operational w/AA                   .153         166.3
  Operational w/FLS over
  Operational w/AA                   .122         132.6

DESIGN B

Number of Job Families               Empirical                  Operational
                                     Difference   % Gain        Difference   % Gain
Increase from:
  9 to 16                            .065         24.4          .123         91.1
  16 to 23                           .043         13.0          .039         15.1
  9 to 23                            .108         40.6          .162         120.0

Clustering Method                    Average Across Job Families
                                     Difference   % Gain
  Empirical over Operational         .094         40.9

From Design A, it is possible to address the operational question of the dollar value that
would be "lost" to the Army if the number of job families were decreased from 9 to 6.
Decreasing the number of job families has been the recommendation of recent research in the
Army (McLaughlin et al., 1984). The results from the present research show that, for the
Proj.A (Exp.Batt) condition, the difference of .075 between 9 and 6 job families represents
approximately $137 million per year that could be lost by decreasing the number of job
families. For the Proj.A (ASVAB) condition, the difference of .055 between 9 and 6 families
represents approximately $100 million per year lost.

It  is  also  possible  to  address  the  question  of  how  much  improvement  the  Army  could 
expect  by  increasing  the  number  of  job  families.  The  results  from  Design  B  provide  the  most 
dramatic  illustration  of  the  dollar  value  improvements  possible  for  the  Army.  For  example,  note 
that if the current nine operational job families (aptitude areas) were abandoned in favor of 16
operational job families (combination of aptitude area and CMF), a percentage gain of 91.1%
could be expected, which translates into a $225 million per year improvement.
Alternatively, if the current nine operational job families (aptitude areas) were abandoned in
favor of 23 operational job families (CMF categories), a gain of 120% could be expected, which
translates into a $297 million per year improvement. Furthermore, if the Army utilized
the CE method of clustering developed in this research to cluster jobs into 23 job families, the
expected dollar value improvement would be approximately $438 million per year over
the current 9 operational job families.

Another comparison of interest is the gain in MPP that can be attributed to using the
empirical CE method of clustering instead of the operational methods currently used by the
Army. From Design A, the results showed gains of only 12% to 14% for the empirical nine
job families over the operational nine job families. From Design B, with the much more
realistic and broader range of job families, the results showed gains of 40% when averaged
across job families. This 40% gain translates into a $172 million per year improvement
to the Army that could accrue from using a CE method of clustering for forming job families.

Table  29  also  shows  the  difference  between  the  two  types  of  predictors  that  were  part  of 
Design A. The difference between the two types of predictors suggests that if the Army were to adopt
an expanded set of experimental predictors added to the ASVAB instead of the ASVAB alone,
it could expect MPP improvements of .069, representing a 29.1% gain in MPP. This gain
translates into an improvement worth $126 million per year.

Finally,  from  the  section  comparing  FLS  assignment  to  aptitude  area  assignment  in  Table 
29  (only  Design  A),  it  is  apparent  that  FLS  assignment  is  substantially  better  than  aptitude  area 
assignment. The results show that if the Army were to cluster jobs into nine job families using
a CE method and combine this with FLS assignment instead of its current system (nine
operational job families with AA assignment), its estimated average increase in MPP would
be .153, representing a 166.3% gain in predicted performance. This MPP gain translates into
an improvement worth $280 million per year. The results in Table 29 also show that
even if the Army were to keep its current nine operational job families but change to FLS
assignment using the full ASVAB, its estimated average increase in MPP would be .122,
representing a 132.6% gain in predicted performance. This MPP gain translates into an
improvement worth $224 million per year.
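
As a consistency check on these figures, a difference and its percentage gain together imply a baseline MPP. The sketch below assumes the percentage gain is computed as 100 times the difference divided by the baseline MPP, which the text does not spell out; the implied baseline is a back-calculation for illustration, not a value reported in this section.

```python
def implied_baseline(difference, pct_gain):
    """Baseline MPP implied by a difference and its percentage gain.

    Assumes pct_gain = 100 * difference / baseline.
    """
    return difference / (pct_gain / 100.0)

# FLS versus aptitude area assignment rows of Table 29 (Design A, ASVAB).
empirical_fls = implied_baseline(0.153, 166.3)
operational_fls = implied_baseline(0.122, 132.6)
print(f"{empirical_fls:.3f} {operational_fls:.3f}")
```

Both comparisons are made against the same condition (operational job families with AA assignment), so the two back-calculated baselines should agree, and they do to three decimal places.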

In  summary,  this  section  demonstrated  that  many  of  the  statistically  significant  differences 
among  the  MPP  scores  also  represented  substantial  practical  improvements  in  MPP.  Some  of 
the  greatest  improvements  come  from  increasing  the  number  of  job  families  and  from  using  FLS 
instead  of  aptitude  area  assignment.  Zeidner  and  Johnson  (1989b)  predicted  that  use  of  FLS 
assignment  would  provide  the  greatest  improvements  in  MPP  scores  with  the  second  greatest 
improvements  occurring  by  increasing  the  number  of  job  families.  They  estimated  that 
increasing  the  number  of  job  families  would  provide  a  50%  improvement  above  the  benefits 
from  FLS  assignment.  In  fact,  the  current  study  results  suggest  that  this  figure  is  actually  much 
greater. If the Army were to utilize the best condition from Design A of this study (i.e., a CE
clustering method to increase their number of job families to 12, the full set of experimental
predictors,  and  FLS  assignment)  there  would  be  a  71.3%  improvement  in  MPP  compared  to  the 
use  of  the  current  9  operational  job  families,  the  ASVAB,  and  FLS  assignment.  If  the  Army 
were  to  utilize  the  best  condition  from  Design  B  of  this  study  (i.e.,  a  CE  clustering  method  to 
increase  their  number  of  job  families  to  23,  the  ASVAB,  and  FLS  assignment)  there  would  be 
a  177%  improvement  in  MPP  compared  to  the  use  of  the  current  9  operational  job  families,  the 
ASVAB,  and  FLS  assignment.  These  improvements  can  be  translated  into  added  gains  ranging 
from  $280  to  over  $400  million  per  year  over  and  above  the  benefits  accrued  from  the  addition 
of FLS assignment procedures.

IV. DISCUSSION AND CONCLUSIONS


A.  Implications  For  DAT 

1.  Refinement  of  DAT 

Differential  assignment  theory  (DAT)  was  first  proposed  by  Johnson  and  Zeidner  (1990, 
1991)  and  Johnson,  Zeidner,  and  Scholarios  (1990)  as  the  conceptual  basis  for  initiating  research 
and  interpreting  results  from  research  intended  to  improve  the  efficiency,  measured  in  terms  of 
MPP,  of  a  personnel  classification  system.  The  concept  of  DAT  is  derived  from  the  integrative 
review  of  personnel  classification  literature,  with  special  emphasis  on  the  contributions  of 
Brogden  and  Horst,  combined  with  the  systematic  development  of  methodologies  for  improving 
classification  efficiency,  as  described  in  Johnson  and  Zeidner  (1990,  1991). 

Several  fundamental  concepts  form  the  basic  assumptions  for  DAT.  The  first  and  third 
(as  first  presented)  hold  that,  in  the  general  case,  there  is  a  complex  set  of  principles  defining 
separate  approaches  for  optimizing  either  the  selection  or  classification  procedures.  There  is  an 
exception for the special situation where FLS composites based on complete information are used
as both selection and assignment variables for each job family, each job family consists of a
single job, and an LP algorithm is used to optimally and simultaneously select and assign entities
to jobs (the MDS algorithm). All deviations from this special situation, where the same test
composites and job families are optimal for both selection and classification, require that a
decision be made as to whether it is desired to maximize the effectiveness of selection or
classification.

The second concept of DAT maintains that utility models, where the object is to
maximize the benefits less the costs, provide the best approach for evaluating alternative
policies  and  procedures  for  selection,  classification  or  placement,  and  assignment  of  personnel 
to  jobs. 

The  fourth  concept  argues  that  computer  technology  has  reached  the  state  where  it  is 
practical  to  implement  any  selection  and  assignment  strategy  and/or  algorithm  that  can  be  shown 
to provide a useful gain in MPP.

Additional basic concepts of DAT could be derived from an examination of the examples
of  DAT  principles  provided  by  Johnson  and  Zeidner  (1990,  1991).  For  example,  it  appears  that 
DAT  assumes  a  non-trivial  degree  of  multidimensionality  in  the  joint  predictor-criterion  (JP-C) 
space  despite  the  inevitable  presence  of  a  strong  general  cognitive  ability  factor,  "g",  and  the 
high  level  of  credibility  of  several  other  validity  generalization  concepts  often  associated  with 
the  traditional  general  factor  theorists.  It  should  be  emphasized  that  there  is  no  inconsistency 
between  DAT  and  validity  generalization  theory  as  originated  by  Mosier  (1951)  or  the  more 
recent  literature  on  validity  generalization  as  it  pertains  to  selection  efficiency  or  the  utility  of 
selection. 

DAT  provides  a  basis  for  optimism  in  that  it  argues  for  the  feasibility  of  designing, 
developing,  and  implementing  personnel  classification  and  assignment  systems  far  superior  to 
the  existing  operational  systems.  We  believe  that  most  operational  classification  systems 
developed  or  modified  since  1980  were  designed  and/or  evolved  in  the  pessimistic  belief  that  the 
dimensionality  in  the  JP-C  space  was  small  (even  to  the  point  of  unidimensionality)  and  that  the 
weights utilized in FLS composites lacked adequate stability to permit the design of an effective
personnel  classification  system.  Thus,  the  optimistic  belief  that  efficient  personnel  classification 
is  attainable  is  also  an  attribute  of  DAT. 

Several  DAT  principles  are  supported  by  the  results  of  this  study.  The  principles 
confirmed  include:  (1)  the  effectiveness  of  using  FLS  composites  as  assignment  variables  (AVs) 
in  place  of  unit  weighted  composites  designed  to  maximize  predictive  validity;  (2)  the  increase 
in  MPP  as  the  number  of  job  families  is  increased;  and  (3)  the  further  increase  in  MPP  as 
improved  job  family  structure  raises  FLS  composite  validities  and  reduces  FLS  composite 
intercorrelations. 

The  findings  of  this  study  will  be  integrated  with  those  of  a  number  of  companion  studies 
to  further  refine  the  basic  concepts  and  principles  of  DAT.  For  example,  in  the  present  study 
we  did  not  anticipate  the  extent  to  which  validities  were  increased  by  a  job  clustering  algorithm 
designed  to  sustain  the  average  differential  validity  remaining  in  the  set  of  job  families  as  jobs 
were agglutinated into the desired number of job families. We were similarly surprised, in a
companion study, by the smallness of the gain in MPP attributable
to the differing validities of a single predictor across jobs. Each of these studies has both
confirmed and enriched DAT.

2.  Implementation  of  DAT 

Several  DAT  principles  are  supported  by  the  results  of  this  study.  These  findings  point 
to  very  high  potential  benefits  obtainable  from  a  major  overhaul  of  the  Army’s  selection  and 
classification  system.  An  effective  redesign  of  this  system  should  start  with  the  acceptance  of 
the maximization of MPP as an overarching objective.

The  design  of  an  effective  selection  and  classification  system  requires  the  consideration 
of  many  practical  issues  outside  the  scope  of  DAT.  Tradition  and  perceived  necessity  provide 
a  number  of  entrenched  solutions  to  operational  problems  concerned  with  matching  job 
preferences  of  recruits,  distribution  of  personnel  across  MOS  to  meet  quality  standards, 
providing  vocational  opportunities  for  females,  minorities  and  underprivileged  recruits,  and  the 
use of unit (or at least positive) weights for the tests in AVs. We believe DAT would
bear considerably on a reconsideration of many policy issues. Although this study is not
directly  focused  on  these  issues,  we  recognize  that  the  full  implementation  of  the  results  of  this 
study  requires  the  reconsideration  of  these  policies. 

Zeidner  and  Johnson  (1989b,  1991b)  predicted  that  use  of  FLS  composites  instead  of  AAs 
could  increase  the  potential  MPP  obtainable  from  a  classification  system  by  100  percent.  The 
present  study  confirms  this  prediction  by  showing  that  the  gain  obtainable  from  optimizing  AVs 
is  133  percent  —  if  initial  assignments  could  be  accomplished  using  an  LP  algorithm  without 
consideration  of  individual  preferences.  We  still  lack  information  as  to  the  effect  that  a 
concerted  effort  to  persuade  recruits  to  accept  suitable  assignments  (i.e.,  assignments  to  jobs  in 
which  their  predicted  performance  is  relatively  high)  would  have  on  recruiting  costs.  Until 
recently  many  thought  that  very  little  utility  was  lost  through  permitting  preferences  (often  based 
on  no  information,  or  worse,  serious  misinformation  about  Army  jobs)  to  be  the  primary 
determinant of initial assignments. Evidence in this and related studies shows that a great deal
of utility is lost through failure to make more use of optimal assignment information.
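
The idea of optimal assignment without regard to preferences can be illustrated with a toy sketch. It is not the LP algorithm used in the study: it simply enumerates every one-person-per-job assignment for a small, made-up matrix of predicted performance scores and keeps the one with the highest mean. All names and numbers are hypothetical.

```python
from itertools import permutations

def optimal_assignment(predicted):
    """Brute-force search for the one-person-per-job assignment with the highest mean.

    predicted[i][j] is the predicted performance of person i in job j.
    Returns (jobs, mean), where jobs[i] is the job given to person i.
    """
    n = len(predicted)
    best_perm, best_total = None, float("-inf")
    for perm in permutations(range(n)):
        total = sum(predicted[i][perm[i]] for i in range(n))
        if total > best_total:
            best_perm, best_total = perm, total
    return list(best_perm), best_total / n

# Hypothetical scores: person 0 is the strongest candidate for every job.
scores = [
    [0.9, 0.8, 0.7],
    [0.6, 0.2, 0.1],
    [0.5, 0.4, 0.0],
]
assignment, mean_predicted = optimal_assignment(scores)
print(assignment, round(mean_predicted, 3))
```

In this toy case the best overall result places person 0 in the job where that person scores lowest, because the other two people lose far more when moved. Exhaustive search is only workable for tiny examples; realistic problems need a linear programming or assignment solver.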

B. Operational Implications

1.  Broad  Conclusions 

Our  data  provide  compelling  evidence  that  the  existing  operational  test  composites  could 
be  reconstituted  to  substantially  improve  classification  efficiency.  The  evidence  also  strongly 
supports  the  usefulness  of  the  existing  ASVAB  as  a  classification  tool.  The  percentage  gain 
obtainable  from  adding  all  20  of  the  Project  A  experimental  measures  to  the  existing  tests  of  the 
ASVAB  is  a  31  percent  increase  in  classification  efficiency.  A  299  percent  gain  over  the 
operational  job  families  and  composites  is  achieved  by  using  FLS  composites,  optimal 
assignment,  12  CE  job  families,  and  the  experimental  Project  A  measures. 

The  31  percent  gain  from  adding  new  Project  A  tests  is  substantial  when  compared  to  the 
gains  obtainable  from  making  changes,  one  at  a  time,  to  the  same  baseline  condition  used  to 
compute  this  gain.  If  we  use  FLS  composites  for  9  operational  job  families  as  our  baseline 
(MPP  =  .214),  we  show  a  gain  of  31  percent  by  only  adding  the  20  experimental  tests,  a  gain 
of  15  percent  by  only  changing  to  9  job  families  based  on  empirical  clustering,  and  a  gain  of  14 
percent  by  changing  to  12  job  families  based  on  empirical  clustering. 

A  number  of  general  conclusions  can  be  drawn  by  examining  these  comparative  gains. 
First,  we  see  a  higher  classification  efficiency  inherent  in  the  ASVAB  than  is  usually  posited. 
Second, the failure to obtain even higher differential gains from the addition of new experimental
variables  to  the  ASVAB  probably  reflects  the  relative  lack  of  emphasis  given  to  classification 
efficiency  by  test  development  researchers  over  the  past  two  decades.  Third,  the  total  gain  of 
299  percent  achieved  by  implementing  all  of  our  proposed  changes  in  the  operational  system, 
including  the  additional  differential  validity  provided  by  Project  A  experimental  tests,  reflects 
a potential that cannot presently be fully realized in an operational system constrained by
current  policies,  but  definitely  points  to  a  route  that  should  eventually  lead  to  very  substantial 
gains  in  MPP. 

While  the  procedures  used  to  form  the  existing  operational  job  families  are  clearly  not 
optimal,  they  are  much  more  effective  than  are  the  corresponding  AA  composites.  Even  the 
Career  Management  Field  (CMF)  clusters  provide  considerable  improvement  in  classification 
efficiency when used to expand the number of job families to which assignment is accomplished.
It would appear that job families which meet other administrative and training requirements apart
from  personnel  classification,  such  as  CMF,  can  be  effectively  utilized  in  a  personnel 
classification  system. 

The authors of the principal technical report on Project A (McLaughlin et al., 1984)
concluded that an empirical job clustering process was inherently ineffective because of its
dependence on presumed unstable regression weights used to form assignment variables (AVs).
The  findings  of  the  present  study,  however,  show  that  even  when  sampling  error  is  allowed  to 
take  its  full  toll,  the  MPP  obtainable  in  independent  samples  is  greatly  improved  by  the  use  of 
empirical  clustering  of  jobs  into  families  and  the  representation  of  these  families  by  FLS 
composites. 

A  major  reconstitution  of  the  job  families  in  the  Army’s  classification  system  should  be 
based  on  all  available  validity  data,  as  well  as  on  information  available  from  job  analyses.  We 
do  not  wish  to  suggest  that  the  job  family  structure  decisions  should  be  based  on  the  limited  data 
utilized  in  the  present  methodological  study.  However,  we  are  confident  that  the  major 
conclusions  of  this  study  will  be  confirmed  as  additional  data  are  collected  and  analyzed  using 
simulations  to  obtain  MPP  values. 

2. Policy Issues

A  number  of  policy  issues  pertaining  to  personnel  classification  and  assignment  must  be 
resolved  before  a  new  system  incorporating  DAT  concepts  and  principles  can  be  implemented. 
A  number  of  these  issues  are  noted  below. 

a.  The  "g"  Controversy.  Does  the  poor  classification  efficiency  available  from  the 
existing  operational  AAs  mean  that  the  Army  should,  as  some  validity  generalization 
proponents  contend,  change  to  a  system  which  uses  a  single  measure  of  cognitive  ability, 
plus  measures  of  psychomotor  ability  and  clerical  speed? 

b.  The  Feasibility  of  Implementing  LP  Algorithms.  Can  optimal  assignment 
algorithms  be  implemented  in  the  current  recruiting  market?  If  not,  can  cut  scores  be 
raised  in  such  a  way  as  to  provide  a  similar  level  of  MPP? 

c.  Using  FLS  Composites  as  AVs.  Can  FLS  composites  with  both  positive  and  negative 
weights  be  implemented?  If  not,  can  a  comparably  effective  two-tiered  strategy,  in 
which the second tier uses composites with all positive weights, be implemented?

d. The Substitution of "g" For AFQT. Can a general FLS composite be substituted for
AFQT  as  the  selection  instrument? 

e.  Quality  Distribution.  Can  quality  distribution  policies  be  altered  to  use  predicted 
performance  instead  of  AFQT  as  the  measure  of  personnel  quality? 

f. Optimal Simultaneous Selection And Assignment. Must the Army continue to use
a two-stage selection and classification system in which selection and classification are
accomplished in separate successive stages, instead of the more effective and equitable
single stage system in which selection and classification are accomplished simultaneously?
Such  a  single  stage  algorithm  is  described  by  Johnson  and  Zeidner  (1990,  1991).  It  is 
called  the  multidimensional  screening  (MDS)  algorithm.  Future  plans  to  make  use  of 
MDS  should  affect  the  choice  of  a  job  clustering  algorithm  for  the  design  of  a  new 
system. 

g.  Assessing  Future  Requirements.  Can  the  quality  requirements  of  future  weapon 
systems  be  assessed  in  terms  of  FLS  composites  or  factor  composites  used  in  the  second 
tier  instead  of  through  the  use  of  AFQT? 

C.  Accomplishing  Operational  Changes 

1. Options for the CE Job Clustering Algorithm

We  believe  our  CE  clustering  algorithm  and  the  FORTRAN  program  implementing  this 
algorithm  to  be  a  major  product  of  this  study.  The  flexibility  built  into  this  algorithm  for 
including  additional  options  adds  to  its  value.  There  are  a  number  of  options  which  we  believe 
would  add  to  the  usefulness  of  this  algorithm  for  making  changes  in  operational  classification 
systems. 

We  first  describe  an  earlier,  untested,  concept  that  had  a  number  of  features  relevant  to 
the  operational  use  of  a  job  clustering  algorithm.  We  then  describe  additional  features  we 
believe  have  practical  value  in  the  reconstitution  of  operational  job  families,  and  describe  how 
these  features  could  be  provided  as  options  to  our  CE  clustering  algorithm. 

A classification-efficient job clustering method that was considered, but not selected
for implementation in the present study, has two stages. Kernels, each consisting of one or more
jobs, are selected for the desired number of job families during the first stage. In stage 2, the
remaining jobs are distributed among these kernels (established job families) in such a
way as to maximize Hd. Since the MOS selected to initially define a family (that is, the kernel)
remain in that family throughout the clustering process without being immersed (agglutinated)
into any other family, we believe this second stage would be particularly applicable to the
refinement of operational job families.

For  Design  a,  the  initial  6  (or  9,  or  12)  job  families  were  to  be  identified  as  the  set  of 
6  (or  9,  or  12)  MOS,  out  of  the  18  MOS  in  the  data,  which  as  a  set  of  job  families  (one  job  to 
a family) would yield the highest value for Hd. In applying this concept to an operational situation,
the initial (kernel) job families would instead be provided by a small number of MOS which,
based  on  all  available  information,  appear  to  be  located  near  the  center  of  job  families  which 
span  the  joint  predictor-criterion  space  (with  as  much  distance  between  the  kernels  as  can  be 
obtained).  It  may  be  desirable  to  include  other  career  management  considerations  in  the  decision 
process  that  yields  the  set  of  kernels. 

Further jobs are added to the job family kernels sequentially, with
all unassigned jobs on which adequate data are available being considered during each cycle of
the algorithm. Only one unassigned job is selected for inclusion in a job family during each
cycle. The job that makes the greatest overall contribution to Hd by being agglutinated into a
particular family becomes a member of that family, and the remaining unassigned jobs then become
candidates for selection in the succeeding cycles.
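
The sequential cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the FORTRAN program of this study: the function name, the job labels, and `toy_objective` (a stand-in for Hd, which would really be computed from validity data) are all hypothetical.

```python
from typing import Callable, Dict, List

Families = Dict[str, List[str]]

def greedy_kernel_assignment(
    kernels: Families,
    unassigned: List[str],
    objective: Callable[[Families], float],
) -> Families:
    """Each cycle, try every unassigned job in every family and keep the
    single (job, family) placement that gives the highest objective value."""
    families = {name: list(jobs) for name, jobs in kernels.items()}
    remaining = list(unassigned)
    while remaining:
        best = None  # (score, job, family)
        for job in remaining:
            for fam in families:
                trial = {k: (v + [job] if k == fam else v)
                         for k, v in families.items()}
                score = objective(trial)
                if best is None or score > best[0]:
                    best = (score, job, fam)
        _, job, fam = best
        families[fam].append(job)  # kernel jobs never leave their family
        remaining.remove(job)
    return families

# Hypothetical one-dimensional "locations" for five made-up jobs.
LOCATION = {"A": 0.0, "B": 10.0, "C": 1.0, "D": 9.0, "E": 0.5}

def toy_objective(families: Families) -> float:
    """Stand-in for Hd (an assumption): negative within-family spread."""
    total = 0.0
    for jobs in families.values():
        mean = sum(LOCATION[j] for j in jobs) / len(jobs)
        total -= sum((LOCATION[j] - mean) ** 2 for j in jobs)
    return total

result = greedy_kernel_assignment(
    {"F1": ["A"], "F2": ["B"]}, ["C", "D", "E"], toy_objective
)
print(result)
```

Passing the objective in as a function keeps the cycle logic separate from whatever index is actually being maximized.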

Our CE job clustering algorithm would have more use in the redesign of operational job
families if the algorithm were modified to provide two additional options, each permitting one
or  more  alternative  approaches  to  the  forming  of  job  families.  The  first  of  these  options  is 
sufficient  to  accomplish  the  job  clustering  objectives  of  the  alternative  algorithm  (considered  but 
not  programmed  and  tested)  described  above.  The  two  options  together  permit  an  efficient  use 
of  our  CE  algorithm  for  clustering  jobs  into  a  set  of  families  in  which  no  families  lack  sufficient 
validity  data  to  provide  for  stable  regression  weights  in  the  corresponding  FLS  composites. 

Without  modifying  our  CE  algorithm  we  could  incorporate  a  designated  set  of  jobs  into 
prescribed  job  families  to  form  the  kernels  of  job  families  that  are  desired  for  administrative 
reasons  or,  based  on  prior  data  and  experience,  are  judged  to  have  similar  aptitude  requirements. 
We could then use our unaltered CE algorithm, as applied in this study, but we would probably
prefer  to  use  our  algorithm  with  one  or  both  of  the  proposed  options. 

Option  one  adds  the  capability  of  designating  some  jobs  or  job  families  as  ineligible  for 
agglutination  with  each  other.  These  designated  families  are  prevented,  by  option  one,  from 
losing  their  identity  through  being  agglutinated  with  each  other  or  with  families  containing  more 
than  a  stipulated  number  of  jobs. 
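
One way to picture option one is as an eligibility test applied to each candidate pair before agglutination. The sketch below is an assumption about how such a test might look; the `Family` class, the `protected` flag, and the `max_partner_jobs` parameter are hypothetical names introduced here, not part of the study's program.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Family:
    jobs: List[str]
    protected: bool = False  # designated under option one (hypothetical flag)

def may_agglutinate(a: Family, b: Family, max_partner_jobs: int) -> bool:
    """Return False when merging would cost a designated family its identity."""
    if a.protected and b.protected:
        return False  # designated families may not merge with each other
    if a.protected and len(b.jobs) > max_partner_jobs:
        return False  # partner family exceeds the stipulated number of jobs
    if b.protected and len(a.jobs) > max_partner_jobs:
        return False
    return True

kernel_1 = Family(["11B"], protected=True)
kernel_2 = Family(["71L"], protected=True)
single_job = Family(["13B"])
large_family = Family(["63B", "63H", "63N"])

print(may_agglutinate(kernel_1, kernel_2, max_partner_jobs=1))     # False
print(may_agglutinate(kernel_1, single_job, max_partner_jobs=1))   # True
print(may_agglutinate(kernel_1, large_family, max_partner_jobs=1)) # False
```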

Option  two  requires  a  stipulated  sample  size  for  the  validity  data  associated  with  each 
family  resulting  from  the  agglutination  of  a  pair  of  jobs  and/or  job  families.  The  implementation 
of  this  option  in  our  algorithm  can  be  accomplished  during  the  examination  of  the  D  matrix  in 
each  cycle.  As  the  elements  in  each  D  matrix  are  examined  (searching  for  the  smallest  cell 
value)  for  selecting  the  pair  of  jobs  and/or  job  families  that  would  be  most  appropriate  for 
agglutination,  the  combined  N  associated  with  each  cell  of  D  is  tested  and  a  cell  is  not 
considered  for  selection  unless  the  combined  N  of  the  associated  pair  exceeds  the  prescribed  N. 
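
The constrained search of the D matrix can be illustrated as follows. This is a minimal sketch under assumptions: D is taken to be a symmetric matrix whose smallest eligible cell identifies the pair to agglutinate, `n` holds the validity sample size of each job or family, and the function and variable names are hypothetical.

```python
from typing import List, Optional, Tuple

def pick_pair(
    D: List[List[float]], n: List[int], min_n: int
) -> Optional[Tuple[int, int]]:
    """Find the smallest cell of D among pairs whose combined N exceeds min_n."""
    best = None
    best_value = float("inf")
    for i in range(len(D)):
        for j in range(i + 1, len(D)):
            if n[i] + n[j] <= min_n:
                continue  # combined sample too small: cell not considered
            if D[i][j] < best_value:
                best_value, best = D[i][j], (i, j)
    return best

# Made-up values for three jobs or families.
D = [[0.0, 1.0, 4.0],
     [1.0, 0.0, 2.0],
     [4.0, 2.0, 0.0]]
n = [100, 150, 400]

print(pick_pair(D, n, min_n=300))   # (1, 2): pair (0, 1) fails the N test
print(pick_pair(D, n, min_n=0))     # (0, 1): unconstrained smallest cell
print(pick_pair(D, n, min_n=1000))  # None: no pair is eligible
```

Returning `None` when no pair qualifies gives the calling cycle a natural stopping condition.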

The  preferred  procedure  for  using  this  option  would  be  to  first,  as  stage  1,  obtain  a 
solution  (a  set  of  job  families)  using  the  unaltered  CE  clustering  algorithm.  An  F  matrix  in 
which  the  rows  represent  the  job  families  obtained  in  this  initial  solution,  with  those  families 
based  on  inadequately  sized  validity  samples  shredded  into  their  constituent  jobs,  is  then 
constructed.  In  stage  2,  this  revised  F  matrix  is  then  used  as  input  to  the  CE  clustering 
algorithm  with  both  the  first  and  second  options  activated.  The  succeeding  cycles  of  stage  2, 
in  accordance  with  option  1,  would  then  agglutinate  the  remaining  jobs  with  each  other  or  with 
the job families retained from stage 1. Since, in accordance with option 2, only the individual
jobs and the specified job families now (after stage 1) have large enough validity samples to be
eligible for agglutination, the jobs cannot be agglutinated with each other and are instead
distributed among the job families identified in stage 1 in such a way as to maximize Hd. The
end result would be a set of job families in which all the job families have adequate validity data
to permit the forming of stable AVs, while using the proven efficiency of our CE clustering
algorithm for providing a high value of Hd.

2. Redesigning the Personnel Classification System

Zeidner  and  Johnson  (1989b,  1991b)  proposed  a  sequence  of  changes  in  the  design  of 
operational  systems  for  the  selection,  classification  and  initial  assignment  of  new  personnel. 
They assumed that the adoption of FLS composites as AVs was the source of the
largest potential increase in MPP, and probably the easiest to incorporate in an operational
system. The present study shows that the reconstitution of job families, by increasing both the
quantity and quality of job families used in the classification process, can provide a comparable
gain in MPP.

The  results  of  this  study  support  the  conclusion  that  an  increase  in  the  number  of  job 
families  for  classification  purposes  would  be  economically  profitable  even  if  the  structure  of  the 
sets  of  families  has  been  created  for  some  other  purpose  such  as  career  ladder  management  or 
management  of  training.  The  immediate  increase  in  number  of  job  families,  in  conjunction  with 
improved  aptitude  area  composites  for  each  family,  could  greatly  increase  the  MPP  resulting 
from  the  classification  process.  This  expected  increase  is  so  great  that  we  are  less  certain  than 
we  once  were  that  the  shredding  out  of  the  nine  Army  job  families  should  await  the  adoption  of 
FLS  composites  as  AVs. 

The  four  methodological  studies  concerned  with  the  measurement  and  improvement  of 
classification  efficiency  suggested  in  Zeidner  and  Johnson  (1989),  plus  two  additional  ones  in 
process  of  being  initiated,  should  be  completed  by  the  GWU  research  team  prior  to  1993.  We 
fully  expect  that  DAT  will  be  extensively  expanded  and  refined  by  then  and  the  operational 
methodology  available  for  the  redesign  of  selection  and  classification  systems  will  also  be 
expanded,  validated,  better  described,  and  better  understood  than  at  present.  Research  now  in 
progress  on  related  topics  such  as  synthetic  validity  should  add  to  the  improvement  of  technology 
on  personnel  classification.  We  are  hopeful  that  we  will  see  the  start  of  a  new  era  in  which 
personnel  classification  receives  the  attention  it  deserves. 

We  believe  the  gains  in  MPP  afforded  by  the  CE  algorithm  for  the  formation  of  job 
families  evaluated  in  this  study  are  great  enough  to  justify  future  use  of  this  algorithm  as  a 
research  tool  when  further  validity  data  are  acquired.  We  also  believe  this  algorithm  should 
be  used  to  provide  one  source  of  information  to  be  combined  with  judgment  in  the 
formulation  of  operational  job  families. 

While  synthetic  validity  might  appropriately  be  used  in  the  context  of  predictive  validity 
for  designing  a  selection  system,  it  is  always  inappropriate  to  substitute  predictive  validity 
concepts  for  MPP  in  the  formulation  of  personnel  classification  systems.  Synthetic  validity 
might correctly be viewed as one source of validity information required to implement the MPP
focused  techniques  demonstrated  in  this  study. 

The  implementation  of  an  ideal  operational  classification  system  is  unlikely  to  be 
accomplished  in  a  single  step.  Traditions  relating  to  Army  classification  systems  and  the 
administrative  complexities  involved  in  implementing  changes  inhibit  making  one  overall  major 
change  in  the  operational  classification  system  incorporating  all  desired  improvements.  We 
believe  changes  are  most  likely  to  occur  in  a  number  of  separate  steps: 

1.  Substitute  FLS-ASVAB  composites  for  the  existing  AAs  as  the  AVs  for  the  existing 
9  job  families. 

2.  Increase  the  number  of  job  families  and  corresponding  AVs  using  judgment  to  adjust 
the  existing  CMF  boundaries. 

3.  Construct  improved  SQT-type  job  knowledge  measures  for  a  comprehensive  set  of 
MOS  and  obtain  scores  for  two  years  of  recruit  input  after  soldiers  have  been  on  the 
job  for  6  to  8  months. 

4.  Develop  classification-efficient  families  and  corresponding  AVs  to  substitute  for  the 
a  priori  job  families  (while  retaining  the  large  number  of  job  families). 

5.  Eliminate  use  of  job  families  and  instead  use  separate  AVs  for  each  job  in  the  initial 
assignment  process  (first  tier);  use  a  smaller  number  of  job  families,  one  for  each 
factor  score,  in  the  second  tier. 


Steps  1  and  2 

The  first  two  steps  should  be  accomplished  through  the  use  of  all  available  research  data 
for  the  computation  of  FLS-ASVAB  composites.  The  FLS  composites  should  be  used  to  provide 
both  AA  scores  for  inclusion  in  the  soldier’s  record  and  recommended  classification  to  job 
families  at  the  time  of  initial  assignment.  When  the  number  of  job  families  exceeds  20, 
installation  of  a  two-tiered  classification  system  should  be  considered.  The  first  tier  provides 
for  initial  assignment  and  makes  use  of  FLS  composites.  Composite  scores  for  initial  assignment 
are  computed  within  a  figurative  black  box  and  the  test  weights  are  invisible  to  the  examinees 
and  transparent  to  the  personnel  administration  staff.  The  second  tier  uses  a  smaller  number  of 
factor  composites,  covering  the  same  joint  predictor-criterion  space;  factor  scores  are  recorded 
in official records and are intended for subsequent use by individuals and counselors to assist
in  making  career  decisions. 

Step  3 

The  principal  obstacle  to  an  immediate  application  of  the  results  of  this  study  to  the 
redesign  of  an  operational  system  may  be  the  low  credibility  of  SQT  when  used  as  a  criterion 
variable  and  the  limited  number  of  MOS  covered  in  the  Project  A  concurrent  validation  study. 
To  the  extent  that  the  utility  obtainable  from  the  implementation  of  the  findings  of  the  Design 
B components of this study is dependent on the credibility of SQT as a personnel research
criterion  variable,  the  value  of  obtaining  validity  data  covering  an  equivalent  number  of  MOS, 
but  using  a  more  credible  technical  proficiency  measure,  would  be  well  worth  the  cost. 
Approaches  for  extrapolating  validity  information  from  a  relatively  small  set  of  jobs  to  the  entire 
set  of  MOS  in  the  Army,  such  as  the  use  of  synthetic  validity  methods,  also  require  validities 
for  a  large  number  of  MOS  before  a  credible  validation  of  the  approach  can  be  provided.  A 
major  finding  of  this  study  is  that  the  benefits  obtainable  from  such  a  collection  of  criterion 
scores  would  far  surpass  the  estimated  costs. 

Both  the  McLaughlin  et  al.  (1984)  study  and  the  present  study  show  that  there  is  a 
significant  content  difference  between  SQT  and  other  criterion  variables.  It  is  obvious  that  SQT 
does  not  measure  the  same  thing  as  either  school  grades  or  the  Core  Technical  Proficiency 
(CTP)  criterion  of  the  Project  A  concurrent  validation  study.  Unfortunately,  we  lack  a  more 
"ultimate"  criterion  that  could  be  used  to  establish  the  superiority  of  one  or  another  of  these 
three  criteria.  Judging  from  their  psychometric  properties  and  the  objectives  that  guided  their 
construction,  we  believe  that  most  measurement  specialists  would  readily  agree  that  the  CTP 
criterion  is  best  and  school  grades  the  worst  for  purposes  of  both  research  and  operational 
developments of the type described here. Traditional SQT items are intended to have training
diagnostic applicability, although they are less representative of the job and otherwise inappropriate as
a predictor of on-the-job performance. These items are also frequently intended to be criterion
referenced, guaranteeing poor discriminability among those who are at least minimally qualified.
However,  it  would  not  be  difficult  to  develop  additional  test  items  with  more  appropriate 
psychometric  properties  and  content.  These  additional  items  could  be  administered  at  the  same 
time as the traditional SQT items. It should be possible to minimize the difference between the
content of these additional items and that of the CTP measures for the half of the MOS included
in  the  concurrent  study  of  Project  A  that  lack  "hands  on"  items. 

We  emphasize  that  the  development  of  a  superior  classification  system  requires 
information  about  predictor  and  criterion  variables  for  a  large  number  of  representative  MOS. 
While  we  do  not  reject  the  usefulness  of  using  all  existing  data  in  conjunction  with  analytic  and 
judgmental  data  concerning  MOS  to  accomplish  an  interim  reconstitution  of  job  families  and 
AVs, we believe any resulting system eventually should be validated using a credible CTP-type
criterion  measure.  A  simulation  that  concludes  with  the  computation  of  MPP  of  individuals  or 
entities  after  optimal  assignment  to  jobs  would,  of  course,  be  required. 
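
The closing step of such a simulation, computing MPP (mean predicted performance) after optimal assignment, can be sketched on a toy scale. The example below is an illustration only: it assumes one person per job, uses brute-force enumeration that is practical only for a handful of entities, and all names and scores are made up. An operational simulation would use a linear programming or comparable assignment algorithm instead.

```python
from itertools import permutations
from typing import List, Tuple

def optimal_assignment_mpp(
    predicted: List[List[float]],
) -> Tuple[Tuple[int, ...], float]:
    """predicted[i][j] is the predicted performance of person i in job j.
    Returns the best one-to-one assignment and its mean predicted performance."""
    n = len(predicted)
    best_perm: Tuple[int, ...] = tuple(range(n))
    best_total = float("-inf")
    for perm in permutations(range(n)):  # perm[i] = job given to person i
        total = sum(predicted[i][perm[i]] for i in range(n))
        if total > best_total:
            best_perm, best_total = perm, total
    return best_perm, best_total / n

# Made-up predicted performance scores for 3 people and 3 jobs.
predicted = [
    [0.9, 0.2, 0.1],
    [0.8, 0.7, 0.3],
    [0.4, 0.6, 0.5],
]
assignment, mpp = optimal_assignment_mpp(predicted)
print(assignment, round(mpp, 3))
```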

Step  4 

We  recommend  the  use  of  the  techniques  of  this  study  for  designating  the  MOS  that  form 
job  families  and  for  identifying  corresponding  AVs,  to  form  a  first  tier  classification  system. 
This  can  be  done  when  it  is  felt  that  adequate  validity  information  exists.  We  believe  policy 
makers  should  seriously  consider  the  parallel  installation  of  a  second  tier  in  the  classification 
system  for  use  in  making  career  decisions.  Important  research  results  bearing  on  the  design  of 
a  two-tiered  system  will  be  available  prior  to  1992. 

The  results  of  this  study  indicate  that  the  classification-efficient  clustering  method 
developed  and  described  here  is  more  than  adequate,  and  is  without  doubt  the  best  of  those 
known  to  us  for  use  in  obtaining  the  job  families  for  the  first  tier  in  a  personnel  classification 
system. However, other classification-efficient job clustering methods also have potential for
use  in  future  efforts  to  reconstitute  job  families,  particularly  for  the  second  tier.  For  example, 
rotation  of  CE  factors  in  the  JP-C  space  to  simple  structure  may  provide  classification-efficient 
job  families  and  corresponding  AVs  useful  for  counseling  soldiers  and  the  setting  of  minimum 
standards  that  are  visible  to  all.  Such  a  factorial  approach  is  effective  for  producing  job  families 
that  are  no  more  numerous  than  twice  the  number  of  factors;  a  family  can  be  identified  by  each 
end (positive and negative ends) of each factor. Thus, the factorial approach is not a good source
of  larger  numbers  of  job  families.  However,  a  smaller  set  of  job  families  may  serve  the  needs 
of counselors better than a larger set.

Zeidner and Johnson (1989b, 1991b) have proposed a two-tiered system which would use
a  large  number  of  job  families  for  initial  assignment  and  a  smaller  set  for  counseling  and  other 
administrative  needs.  The  study  exploring  this  issue,  one  that  is  in  progress  at  GWU,  will 
emphasize  a  factor  analytic  approach  in  which  classification-efficient  factors  rotated  to  simple 
structure  in  the  joint  predictor-criterion  space  will  provide  factor  scores  for  use  as  AVs. 

Step  5 

DAT  predicts  that  an  optimal  classification  system  can  be  obtained  when  a  separate  AV 
is  used  for  each  MOS  that  has  some  minimum  amount  of  validity  information.  We  do  not  know 
what  validity  sample  size  is  required  to  provide  this  minimum.  It  seems  reasonable  that  when 
only  very  small  samples  are  available,  judgment  may  be  more  important  than  empirical  data  for 
use  in  both  job  clustering  and  the  definition  of  an  AV  for  each  cluster.  When  samples  are  a 
little  larger,  but  still  small,  the  validity  information  provided  by  similar  MOS  might  be  combined 
to  provide  an  alternative  to  sheer  judgment. 

The  use  of  a  small  number  of  broadly  defined  families  cannot  fail  to  have  at  least  as 
many  MOS  closer  to  one  or  more  of  the  family  boundaries  than  to  the  center  of  the  job  cluster 
where  the  FLS  composite  for  that  family  is  most  representative.  There  is  no  theoretical  basis 
for  believing  that  the  expected  distribution  of  the  points  representing  MOS  in  a  multidimensional 
joint  predictor-criterion  (JP-C)  space  is  anything  other  than  rectangular  (evenly  distributed). 
Assuming  such  a  distribution  of  points  in  a  JP-C  space  with  6  dimensions  (a  6  space),  even  a 
family located in a multidimensional corner would still have a greater probability of being closer
to  one  or  more  of  the  hyperplanes  separating  it  from  the  other  families  than  to  the  center  of  its 
own  family.  Families  located  nearer  to  the  center  of  the  space  will  have  an  even  greater 
probability  of  being  located  on,  or  near,  a  boundary.  An  MOS  located  near  a  boundary  between 
two  families  cannot  be  expected  to  have  classification  efficiency  with  respect  to  that  pair  of  jobs. 
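
The geometric argument above can be checked numerically with a small Monte Carlo sketch. The modeling choices here are assumptions made for illustration: a corner family is represented as a unit hypercube in 6 dimensions, MOS are uniformly distributed within it, only one face per axis borders another family, and the cube's center stands in for the point where the family's FLS composite is most representative.

```python
import random

def fraction_nearer_boundary(
    dims: int = 6, trials: int = 20000, seed: int = 0
) -> float:
    """Share of uniformly placed points that lie closer to an interior
    boundary (the face x_i = 1 on each axis) than to the cell center."""
    rng = random.Random(seed)
    nearer = 0
    for _ in range(trials):
        x = [rng.random() for _ in range(dims)]
        dist_center = sum((xi - 0.5) ** 2 for xi in x) ** 0.5
        dist_boundary = min(1.0 - xi for xi in x)
        if dist_boundary < dist_center:
            nearer += 1
    return nearer / trials

frac = fraction_nearer_boundary()
print(f"fraction nearer a boundary than the center: {frac:.3f}")
```

A family in the interior of the space would be bounded on both sides of every axis, which can only increase this fraction.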

A  promising  alternative  approach  to  placing  MOS  into  broad  job  families  calls  for  the 
computation  of  separate  FLS  composites  for  each  job.  Instead  of  clustering  jobs  into  fixed 
families,  a  separate  cluster  is  formed  around  each  MOS.  Each  such  job  cluster  centered  on  a 
selected  MOS  would  have  adequate  validity  data  to  provide  stable  FLS  composites  for  the  job 
which  forms  the  nucleus  of  each  such  cluster.  The  investigation  of  such  an  approach  would  be 
an  appropriate  next  step  in  continued  research  on  the  reconstitution  of  Army  MOS  into  job 
families. The model sampling techniques and computer software utilized in this study, both of
which  are  now  in  the  public  domain,  would  require  only  minor  modifications  to  permit  their  use 
in  the  execution  of  such  a  study. 

The  optimal  use  of  synthetic  validity  techniques  also  provides  a  promising  means  of  using 
available  validity  data  to  establish  FLS  composites  for  all  but  the  least  populated  MOS.  The  job 
families and AVs resulting from the use of synthetic validities should be evaluated in terms of
MPP computed in the context of system simulations of the type used in this study.

APPENDIX A: JOB SAMPLE


TABLE A-1: Military Occupational Specialty (MOS) Sample Sizes
for Both the Project A and "McLaughlin" Data Sets


MOS   Name                            Project A n   McLaughlin n

11B   Infantryman                             491           6355
12B   Combat Engineer                         544           3109
13B   Cannon Crewmember                       464           6575
16S   MANPADS Crewmember                      338            596
19E   M48-M60 Armor Crewmember                394           3297
27E   TOW/DRAGON Repairer                     123            363
31C   Single Channel Radio Operator*          289           2393
54E   NBC Specialist                          340            113
55B   Ammunition Specialist                   203            288
63B   Light Wheel Vehicle Mechanic            478           1495
64C   Motor Transport Operator                507           3681
67N   Utility Helicopter Repairer             238            511
71L   Administrative Specialist               427           2824
76W   Petroleum Supply Specialist             339            664
76Y   Unit Supply Specialist                  444           1149
91A   Medical Specialist                      392            783
94B   Food Service Specialist                 368           3943
95B   Military Police                         597           4516

      Average Sample Size                     388           2370

*MOS 31C was designated as 05C Radio Teletype Operator in the
McLaughlin 1981/1982 data set.

TABLE A-2: Military Occupational Specialty (MOS) Sample Sizes
for all 60 Jobs in "McLaughlin" Data Set


 MOS                                  n

*05C  Radio TT Operator            2393
 05N  Elec War/SIGINT INTER_IMC     171
*11B  Infantryman                  6355
 11C  Indirect Fire Infmn          1494
 11H  HV Anti-Armor Wpn Infn        979
*12B  Combat Engineer              3109
 12C  Bridge Crewman                450
 12F  Engineer Tracked Crmn         151
*13B  Cannon Crmn (TK4)            6575
 13E  Cannon Fire Direction Sp      627
 13F  Fire Support Sp               693
 15D  Lance Crmb/MLRS Sgt           281
*16S  MANPADS Crewmember            596
 19D  Cavalry Scout                1249
*19E  M48-M60 Armor Crmn           3297
*27E  TOW/Dragon Rep                363
 27F  Vulcan Repairer               130
 31M  Multichannel Comm Eq Op      2482
 31N  Tactical Ckt Con              189
 31V  Tac Comm Sysop/Mech           515
 36C  Wire Sys Inst/Op              499
 43E  Parachute Rigger              100
 52D  Power Generation Equip Rep    178
*54E  NBC Specialist                113
*55B  Ammunition Sp                 288
 57H  Cargo Specialist              272
 62B  Construction Equip Rep        233
 62E  HV Const Equip Rep            202
 62F  Lifting/Loading Eq Op         129
*63B  Lt Wh Veh/Pwr Gen Mech       1495
 63H  Track Veh Repairer            335
 63N  M60A1/A3 Tank Sys Mech        286
 63W  Wheel Veh Mechanic            180
*64C  Motor Transport Op           3681
*67N  Utility Hel Repairer          511
 67V  OBN/Scout Hel Rep             294
 68G  Aircraft Structural Rep       125
 68J  Aircraft FC Repairer          148
*71L  Administrative Sp            2824
 71M  Chapel Activities Sp          182
 71N  Traffic Mgmt Coordinator      163
 72E  Combat Telecom Center Op      569
 73C  Finance Specialist            688
 74D  Computer/Tape Writer          132
 74F  Programmer/Analyst             95
 75B  Personnel Admin Sp           1061
 76C  Eq Rec & Parts Sp             331
 76V  Mat Stor & Hdlg Sp            216
*76W  Petroleum Supply Sp           664
*76Y  Unit Supply Sp               1149
 82C  Field Artillery Surveyor      434
*91B  Medical Specialist            783
 91E  Dental Specialist             203
 91P  X-Ray Specialist              159
 92B  Medical Lab Sp                310
 93H  Air Traffic Con Tower Op      114
*94B  Food Service Sp              3943
*95B  Military Police              4516
 96B  Intelligence Analyst          218
 98C  Elec War/SIGINT Analyst       186

      Average                      1002

* = Design A MOS

APPENDIX B: PREDICTOR MEASURES

TABLE B-1: ASVAB/Project A Experimental Predictors and
Reliabilities


Code  Predictors                             Reliability [a]

ASVAB tests [b]
GS    General Science                        0.86
AR    Arithmetic Reasoning                   0.91
NO    Numerical Operations                   0.78
CS    Coding Speed                           0.85
AS    Auto Shop Information                  0.87
MK    Mathematical Knowledge                 0.87
MC    Mechanical Comprehension               0.85
EI    Electronics Information                0.82
PC    Paragraph Comprehension                0.81
WK    Word Knowledge                         0.92

Paper-and-pencil spatial composite [c]
SPAT  Spatial Composite                      0.71

Perceptual-psychomotor composites [d]
CPAC  Complex perceptual accuracy composite  0.62
CPSP  Complex perceptual speed composite     0.95
NMSA  Number speed and accuracy composite    0.84
PSYM  Psychomotor composite                  0.82
SRAC  Simple reaction accuracy composite     0.52
SRSP  Simple reaction speed composite        0.88

Job orientation composites (JOB) [e]
AUTO  Autonomy composite                     0.50
SUPP  Organizational and Co-Worker Support   0.67
ROUT  Routine composite                      0.46

Temperament and biodata composites (ABLE) [f]
ADJU  Adjustment composite                   0.74
DEPN  Dependability composite                0.76
COND  Physical condition composite           0.85
SURG  Achievement orientation composite      0.78

Interest composites (AVOICE) [f]
AUDI  Audiovisual interest composite         0.74
COMB  Combat interest composite              0.78
FSER  Food service interest composite        0.67
PSER  Protective service interest composite  0.76
TECH  Technical interest composite           0.75
MACH  Machinery interest composite           0.79

[a] ASVAB reliabilities reported in McLaughlin, et al. (1984),
    p.9; Project A reliabilities reported in Campbell (1988).
[b] Tests PC and WK are combined to form VE, a more general
    verbal ability test.
[c] Test-retest reliability (N=468 to 487)
[d] Odd-even reliability
[e] Internal consistency reliability (alpha)
[f] Test-retest reliabilities (N=368 to 412)

TABLE B-2: Army AFQT Composite and Aptitude Area Composites


Code  Composite                         ASVAB Test Formula

AFQT  Armed Forces Qualification Test   AR + NO/2 + VE
CL    Clerical                          VE + NO + CS
CO    Combat                            CS + AR + MC + AS
EL    Electronics Repair                AR + MK + EI + GS
FA    Field Artillery                   CS + AR + MC + MK
GM    General Maintenance               MK + EI + GS +
MM    Mechanical Maintenance            NO + EI + MC +
OF    Operators/Food                    NO + VE + MC +
SC    Surveillance and Communications   NO + CS + VE +
ST    Skilled Technical                 VE + MK + MC + GS
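
As an illustration of how the unit-weighted formulas in Table B-2 are applied, the sketch below computes the composites whose formulas appear complete in the table. The subtest scores are made-up values and the names `COMPOSITES` and `composite_scores` are hypothetical; composites whose formulas are cut off in the table (GM, MM, OF, SC) are deliberately left out.

```python
from typing import Dict

# Weights transcribed from Table B-2 (complete formulas only).
COMPOSITES: Dict[str, Dict[str, float]] = {
    "AFQT": {"AR": 1.0, "NO": 0.5, "VE": 1.0},
    "CL": {"VE": 1.0, "NO": 1.0, "CS": 1.0},
    "CO": {"CS": 1.0, "AR": 1.0, "MC": 1.0, "AS": 1.0},
    "EL": {"AR": 1.0, "MK": 1.0, "EI": 1.0, "GS": 1.0},
    "FA": {"CS": 1.0, "AR": 1.0, "MC": 1.0, "MK": 1.0},
    "ST": {"VE": 1.0, "MK": 1.0, "MC": 1.0, "GS": 1.0},
}

def composite_scores(subtests: Dict[str, float]) -> Dict[str, float]:
    """Weighted sum of subtest scores for each composite."""
    return {
        code: sum(weight * subtests[test] for test, weight in weights.items())
        for code, weights in COMPOSITES.items()
    }

# Hypothetical standard scores for one examinee.
examinee = {"GS": 52, "AR": 55, "NO": 48, "CS": 50, "AS": 45,
            "MK": 60, "MC": 51, "EI": 47, "VE": 58}
scores = composite_scores(examinee)
print(scores)
```

An FLS composite would use the same weighted-sum form, but with regression weights (possibly negative) on all tests in place of these unit weights.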


APPENDIX C: RELIABILITIES FOR "MCLAUGHLIN" DATA SET


TABLE C-1: Cronbach Alpha Reliability Estimates Across Three
Years for the SQT Criterion


MOS 

n 

:Reli ability  1987 
Alpha  n 

Reliability 

Alpha 

1988 

n 

Reliability 

Alpha 

1989 

n 

*05C  Radio  TT  Operator 

2393 

0.86 

5660 

0.82 

4827 

0.75 

4504 

OSH  Elec  Uar/SIGINT  INTER_IMC 

171 

0.83 

591 

0.77 

461 

0.67 

450 

*11B  Infantryman 

6355 

0.88 

22332 

0.88  22329 

0.86  23596 

11C  Indirect  Fire  Infmn 

1494 

0.89 

3138 

0.88 

3679 

0.88 

4805 

11H  HV  Anti-Armor  Wpn  Infn 

979 

0.87 

1237 

0.86 

2331 

0.78 

2627 

*128  Combat  Engineer 

3109 

0.81 

6944 

0.85 

6733 

0.86 

6511 

12C  Bridge  Crewman 

450 

0.75 

974 

0.82 

1083 

0.79 

1109 

12F  Engineer  Tracked  Crnri 

151 

0.89 

299 

0.86 

429 

0.87 

376 

*13B  Cannon  Crmn  (TK4) 

6575 

0.82 

5163 

0.83 

5812 

0.83 

6944 

13E  Cannon  Fire  Direction  Sp 

627 

0.87 

833 

0.86 

1431 

0.89 

1234 

13F  Fire  Support  Sp 

693 

0.78 

1253 

0.86 

1275 

0.84 

1339 

150  Lance  Crmb/MLRS  Sgt 

281 

0.90 

945 

*16S  MANPAOS  Crewmember 

596 

0.76 

1231 

0.77 

1467 

190  Cavalry  Scout 

1249 

0.87 

2555 

0.87 

1947 

0.87 

1233 

*19E  M48-M60  Armor  Crmn 

3297 

0.87 

1121 

0.87 

2952 

0.79 

83 

*27E  TOU/Oragon  Rep 

363 

0.70 

194 

0.74 

279 

0.88 

225 

27F  Vulcan  Repairer 

130 

0.73 

98 

0.84 

106 

0.90 

131 

31M  Multichannel  Comm  Eq  Op 

2482 

0.86 

3440 

0.86 

4543 

0.76 

4215 

31N  Tactical  Ckt  Con 

189 

0.88 

196 

0.86 

196 

0.81 

336 

31V  Tec  Coma  Sysop/Hech 

515 

0.86 

2602 

0.80 

1519 

0.69 

1954 

36C  Wire  Sys  Inst/Op 

499 

0.88 

524 

43E  Parachute  Rigger 

100 

0.91 

568 

0.86 

571 

0.89 

1008 

520  Power  Generation  Equip  Rep  178 

0.85 

3157 

0.85 

2260 

*54E  NBC  Specialist 

113 

0.79 

1490 

*558  Ammunition  Sp 

288 

0.86 

1486 

0.88 

1582 

0.86 

1623 

57H  Cargo  Specialist 

272 

0.75 

851 

628  Construction  Equip  Rep 

233 

0.84 

1385 

0.90 

1959 

0.88 

1950 

62E  HV  Const  Equip  Rep 

202 

0.84 

1166 

0.79 

1353 

0.84 

1503 

62F  Lifting/Loading  Eq  Op 

129 

0.82 

457 

0.83 

424 

0.80 

426 

*638  Lt  Wh  Veh/Pwr  Gen  Mech 

1495 

0.81 

8184 

0.83 

4559 

0.88 

7863 

63H  Track  Veh  Repairer 

335 

0.88 

1535 

r  « 4 

1651 

0.87 

1537 

63N  M60A1/A3  Tank  Sys  Mech 

286 

0.79 

743 

250 

0.79 

255 

63V  Wheel  Veh  Mechanic 

180 

0.87 

2434 

0.86 

2404 

0.85 

2501 

*64C  Motor  Transport  Op 

3681 

0.87 

10359 

*67N  Utility  Hel  Repairer 

511 

0.86 

1122 

0.76 

758 

0.77 

1056 

67V  OBN/Scout  Hel  Rep 

294 

0.80 

1031 

0.91 

1085 

0.78 

1130 

68G  Aircraft  Structural  Rep 

125 

0.86 

681 

0.83 

510 

0.89 

468 

68J  Aircraft  FC  Repairer 

148 

0.89 

267 

0.88 

294 

0.91 

499 

*711  Administrative  Sp 

2824 

0.82 

7576 

0.85 

7940 

0.85 

6928 

71m  Chapel  Activities  Sp 

182 

0.82 

785 

0.87 

854 

0.84 

749 

71N  Traffic  Mgmt  Coordinator 

163 

0.82 

982 

0.67 

972 

72E  Combat  Telecom  Center  Op 

569 

0.90 

2107 

0.86 

1555 

0.82 

1914 
(continued)


Reliability  1987

Reliability 

1988 

Reliability 

1989 

MOS

n 

Alpha 

n 

Alpha 

n 

Alpha 

n 

73C  Finance  Specialist 

688 

0.82 

1601 

0.80 

1704 

0.79 

1695 

74D  Computer/Tape  Writer

132 

0.77 

750 

74F  Programmer/Analyst

95 

0.75 

419 

75B  Personnel  Admin  Sp 

1061 

0.81 

1956 

0.80 

2548 

0.72 

2604 

76C  Eq  Rec  &  Parts  Sp

331 

0.75 

3500 

0.81 

3957 

0.75 

3497 

76V  Mat  Stor  &  Hdlg  Sp

216 

0.78 

3591 

0.73 

3964 

0.70 

3490 

*76W  Petroleum  Supply  Sp

664 

0.85 

4574 

0.78 

5052 

*76Y  Unit  Supply  Sp 

1149 

0.87 

6917 

0.89 

7629 

0.85 

6903 

82C  Field  Artillery  Surveyor 

434 

0.85 

762 

0.83 

498 

0.82 

684 

*91B  Medical  Specialist

783 

0.85 

8172 

0.81 

10363 

0.79 

10499 

91E  Dental  Specialist 

203 

0.86 

831 

0.86 

969 

0.83 

1075 

91P  X-Ray  Specialist 

159 

0.86 

380 

0.91 

522 

0.83 

729 

92B  Medical  Lab  Sp

310 

0.88 

837 

0.91 

1067 

0.87 

1081 

93H  Air  Traffic  Con  Tower  Op 

114 

0.87 

208 

*94B  Food  Service  Sp

3943 

0.82 

7131 

0.82 

8422 

0.77 

7607 

*95B  Military  Police

4516 

0.81 

9250 

0.80 

9218 

0.80 

10980 

96B  Intelligence  Analyst

218 

0.75 

418 

0.74 

429 

0.72 

563 

98C  Elec  War/SIGINT  Analyst 

186 

0.69 

481 

0.70 

161 

0.75 

377

*  =  Design  A  MOS

Average 

1002 

0.83 

2688 

0.83 

2893 

0.81 

3018 
APPENDIX D: POPULATION DATA


TABLE D-1: 1980 Youth Population ASVAB Intercorrelations
(see Appendix B for code names)

        GS     AR     NO     CS     AS     MK     MC     EI     VE
GS    1.00    .72    .52    .45    .64    .69    .70    .76    .80
AR     .72   1.00    .63    .51    .53    .83    .69    .66    .73
NO     .52    .63   1.00    .70    .30    .62    .40    .41    .62
CS     .45    .51    .70   1.00    .22    .52    .34    .34    .57
AS     .64    .53    .30    .22   1.00    .41    .74    .75    .52
MK     .69    .83    .62    .52    .41   1.00    .60    .59    .70
MC     .70    .69    .40    .34    .74    .60   1.00    .74    .60
EI     .76    .66    .41    .34    .75    .59    .74   1.00    .67
VE     .80    .73    .62    .57    .52    .70    .60    .67   1.00

Source: Personal Communication from Lawrence M. Hanser, ARI
Chief, Selection and Classification Tech. Area, to
Jesse Orlansky, Institute for Defense Analyses,
13 July 1988

TABLE  D-2:  Population  Predictor  Intercorrelations 
(see  Appendix  B  for  code  names) 

PREDICTORS 1-12

           GS       AR       NO       CS       AS       MK       MC       EI       VE     SPAT     CPAC     CPSP
GS     1.0000   0.7200   0.5200   0.4500   0.6400   0.6900   0.7000   0.7600   0.8000   0.6707   0.3166   0.3170
AR     0.7200   1.0000   0.6300   0.5100   0.5300   0.8300   0.6900   0.6600   0.7300   0.7301   0.3560   0.2876
NO     0.5200   0.6300   1.0000   0.7000   0.3000   0.6200   0.4000   0.4100   0.6200   0.5162   0.3047   0.3119
CS     0.4500   0.5100   0.7000   1.0000   0.2200   0.5200   0.3400   0.3400   0.5700   0.4877   0.3155   0.2953
AS     0.6400   0.5300   0.3000   0.2200   1.0000   0.4100   0.7400   0.7500   0.5200   0.5677   0.2084   0.2427
MK     0.6900   0.8300   0.6200   0.5200   0.4100   1.0000   0.6000   0.5900   0.7000   0.6802   0.3485   0.2811
MC     0.7000   0.6900   0.4000   0.3400   0.7400   0.6000   1.0000   0.7400   0.6000   0.7413   0.2775   0.3005
EI     0.7600   0.6600   0.4100   0.3400   0.7500   0.5900   0.7400   1.0000   0.6700   0.6159   0.2749   0.2617
VE     0.8000   0.7300   0.6200   0.5700   0.5200   0.7000   0.6000   0.6700   1.0000   0.6234   0.3678   0.2783
SPAT   0.6707   0.7301   0.5162   0.4877   0.5677   0.6802   0.7413   0.6159   0.6234   1.0000   0.3886   0.4057
CPAC   0.3166   0.3560   0.3047   0.3155   0.2084   0.3485   0.2775   0.2749   0.3678   0.3886   1.0000  -0.2025
CPSP   0.3170   0.2876   0.3119   0.2953   0.2427   0.2811   0.3005   0.2617   0.2783   0.4057  -0.2025   1.0000
NMSA   0.5895   0.7156   0.6966   0.5545   0.3938   0.6774   0.4914   0.4996   0.6498   0.6143   0.3000   0.4129
PSYM   0.4459   0.4383   0.3249   0.2920   0.4586   0.3841   0.5479   0.4544   0.3773   0.6040   0.2477   0.3768
SRAC   0.2136   0.2179   0.1653   0.1703   0.1901   0.1861   0.2100   0.2023   0.2312   0.2311   0.2284   0.0642
SRSP   0.2169   0.2283   0.2646   0.2534   0.1385   0.2146   0.1892   0.1766   0.2288   0.2569   0.0695   0.3716
AUTO   0.2486   0.2261   0.1849   0.1562   0.2227   0.1953   0.2203   0.2393   0.2602   0.2039   0.0566   0.1009
SUPP   0.1383   0.1196   0.1739   0.1745   0.0436   0.1294   0.0584   0.0938   0.2090   0.0978   0.0886   0.0516
ROUT  -0.3150  -0.3021  -0.2525  -0.2355  -0.2507  -0.2620  -0.2898  -0.2737  -0.3420  -0.2974  -0.1429  -0.1408
ADJU   0.2256   0.2399   0.1925   0.1338   0.2048   0.2147   0.2261   0.2259   0.2315   0.2258   0.1227   0.1186
DEPN   0.0522   0.1017   0.1350   0.1520  -0.0384   0.1450   0.0162   0.0330   0.0889   0.0561   0.1070  -0.0034
COND  -0.0462  -0.0322  -0.0048  -0.0387  -0.0147  -0.0269  -0.0123  -0.0348  -0.0556  -0.0352  -0.0547   0.0688
SURG   0.2076   0.2533   0.2371   0.2020   0.1593   0.2393   0.1903   0.2003   0.2392   0.2023   0.1246   0.0997
AUDI   0.0147   0.0022   0.0058   0.0221  -0.0909   0.0482  -0.0184  -0.0171   0.0507   0.0032   0.0199   0.0052
COMB   0.1539   0.0660  -0.0309  -0.0663   0.3433   0.0120   0.2594   0.2220   0.0435   0.1737   0.0135   0.0728
FSER  -0.2097  -0.1852  -0.1295  -0.1199  -0.2366  -0.1408  -0.2317  -0.2179  -0.1939  -0.2148  -0.0924  -0.1278
PSER  -0.0990  -0.1426  -0.1365  -0.1275   0.0101  -0.1601  -0.0577  -0.0580  -0.1356  -0.0907  -0.0818   0.0008
TECH  -0.0039   0.0629   0.1116   0.0783  -0.1353   0.1275  -0.0575  -0.0342   0.0483  -0.0134   0.0601  -0.0156
MACH  -0.1545  -0.1908  -0.2822  -0.2951   0.1864  -0.2210   0.0465   0.0075  -0.2955  -0.0620  -0.1171  -0.0538
TABLE D-2 (CONT.): Population Predictor Intercorrelations

PREDICTORS 13-24

         NMSA     PSYM     SRAC     SRSP     AUTO     SUPP     ROUT     ADJU     DEPN     COND     SURG     AUDI
GS     0.5895   0.4459   0.2136   0.2169   0.2486   0.1383  -0.3150   0.2256   0.0522  -0.0462   0.2076   0.0147
AR     0.7156   0.4383   0.2179   0.2283   0.2261   0.1196  -0.3021   0.2399   0.1017  -0.0322   0.2533   0.0022
NO     0.6966   0.3249   0.1653   0.2646   0.1849   0.1739  -0.2525   0.1925   0.1350  -0.0048   0.2371   0.0058
CS     0.5545   0.2920   0.1703   0.2534   0.1562   0.1745  -0.2355   0.1338   0.1520  -0.0387   0.2020   0.0221
AS     0.3938   0.4586   0.1901   0.1385   0.2227   0.0436  -0.2507   0.2048  -0.0384  -0.0147   0.1593  -0.0909
MK     0.6774   0.3841   0.1861   0.2146   0.1953   0.1294  -0.2620   0.2147   0.1450  -0.0269   0.2393   0.0482
MC     0.4914   0.5479   0.2100   0.1892   0.2203   0.0584  -0.2898   0.2261   0.0162  -0.0123   0.1903  -0.0184
EI     0.4996   0.4544   0.2023   0.1766   0.2393   0.0938  -0.2737   0.2259   0.0330  -0.0348   0.2003  -0.0171
VE     0.6498   0.3773   0.2312   0.2288   0.2602   0.2090  -0.3420   0.2315   0.0889  -0.0556   0.2392   0.0507
SPAT   0.6143   0.6040   0.2311   0.2569   0.2039   0.0978  -0.2974   0.2258   0.0561  -0.0352   0.2023   0.0032
CPAC   0.3000   0.2477   0.2284   0.0695   0.0566   0.0886  -0.1429   0.1227   0.1070  -0.0547   0.1246   0.0199
CPSP   0.4129   0.3768   0.0642   0.3716   0.1009   0.0516  -0.1408   0.1186  -0.0034   0.0688   0.0997   0.0052
NMSA   1.0000   0.4413   0.1983   0.3023   0.1890   0.1497  -0.2577   0.2093   0.0990   0.0062   0.2301  -0.0293
PSYM   0.4413   1.0000   0.1434   0.2696   0.1371   0.0528  -0.2091   0.1934  -0.0230   0.0990   0.1352  -0.0224
SRAC   0.1983   0.1434   1.0000   0.1200   0.0368   0.0561  -0.0919   0.0694   0.0245  -0.0456   0.0498  -0.0241
SRSP   0.3023   0.2696   0.1200   1.0000   0.0681   0.0635  -0.1225   0.1149   0.0250   0.0477   0.0991  -0.0152
AUTO   0.1890   0.1371   0.0368   0.0681   1.0000   0.2877  -0.1530   0.1069   0.0051   0.0531   0.2010   0.1033
SUPP   0.1497   0.0528   0.0561   0.0635   0.2877   1.0000  -0.2384   0.1163   0.2542   0.0584   0.3358   0.1682
ROUT  -0.2577  -0.2091  -0.0919  -0.1225  -0.1530  -0.2384   1.0000  -0.1912  -0.0363  -0.0653  -0.2435  -0.0059
ADJU   0.2093   0.1934   0.0694   0.1149   0.1069   0.1163  -0.1912   1.0000   0.3414   0.2268   0.6038   0.0622
DEPN   0.0990  -0.0230   0.0245   0.0250   0.0051   0.2542  -0.0363   0.3414   1.0000   0.1279   0.5971   0.1924
COND   0.0062   0.0990  -0.0456   0.0477   0.0531   0.0584  -0.0653   0.2268   0.1279   1.0000   0.3410   0.0622
SURG   0.2301   0.1352   0.0498   0.0991   0.2010   0.3358  -0.2435   0.6038   0.5971   0.3410   1.0000   0.1838
AUDI  -0.0293  -0.0224  -0.0241  -0.0152   0.1033   0.1682  -0.0059   0.0622   0.1924   0.0622   0.1838   1.0000
COMB   0.0276   0.2522   0.0092   0.0196   0.1373   0.0294  -0.0808   0.1666  -0.0298   0.1537   0.1868   0.1781
FSER  -0.1650  -0.2282  -0.0718  -0.1014  -0.0944  -0.0527   0.2245  -0.0707   0.0489  -0.0341  -0.0412   0.3074
PSER  -0.1145   0.0214  -0.0293  -0.0071   0.0002   0.0635   0.0482   0.0392   0.0340   0.1304   0.0790   0.1378
TECH   0.0688  -0.0316  -0.0292   0.0116   0.0684   0.2415   0.0084   0.1489   0.3069   0.0869   0.2955   0.6719
MACH  -0.2163   0.0616  -0.0653  -0.0809   0.0138  -0.0697   0.1119   0.0014  -0.1022   0.1296   0.0076   0.2014

PREDICTORS 25-29

         COMB     FSER     PSER     TECH     MACH
GS     0.1539  -0.2097  -0.0990  -0.0039  -0.1545
AR     0.0660  -0.1852  -0.1426   0.0629  -0.1908
NO    -0.0309  -0.1295  -0.1365   0.1116  -0.2822
CS    -0.0663  -0.1199  -0.1275   0.0783  -0.2951
AS     0.3433  -0.2366   0.0101  -0.1353   0.1864
MK     0.0120  -0.1408  -0.1601   0.1275  -0.2210
MC     0.2594  -0.2317  -0.0577  -0.0575   0.0465
EI     0.2220  -0.2179  -0.0580  -0.0342   0.0075
VE     0.0435  -0.1939  -0.1356   0.0483  -0.2955
SPAT   0.1737  -0.2148  -0.0907  -0.0134  -0.0620
CPAC   0.0135  -0.0924  -0.0818   0.0601  -0.1171
CPSP   0.0728  -0.1278   0.0008  -0.0156  -0.0538
NMSA   0.0276  -0.1650  -0.1145   0.0688  -0.2163
PSYM   0.2522  -0.2282   0.0214  -0.0316   0.0616
SRAC   0.0092  -0.0718  -0.0293  -0.0292  -0.0653
SRSP   0.0196  -0.1014  -0.0071   0.0116  -0.0809
AUTO   0.1373  -0.0944   0.0002   0.0684   0.0138
SUPP   0.0294  -0.0527   0.0635   0.2415  -0.0697
ROUT  -0.0808   0.2245   0.0482   0.0084   0.1119
ADJU   0.1666  -0.0707   0.0392   0.1489   0.0014
DEPN  -0.0298   0.0489   0.0340   0.3069  -0.1022
COND   0.1537  -0.0341   0.1304   0.0869   0.1296
SURG   0.1868  -0.0412   0.0790   0.2955   0.0076
AUDI   0.1781   0.3074   0.1378   0.6719   0.2014
COMB   1.0000   0.0864   0.3913   0.1905   0.5881
FSER   0.0864   1.0000   0.1708   0.3518   0.2269
PSER   0.3913   0.1708   1.0000   0.2216   0.3364
TECH   0.1905   0.3518   0.2216   1.0000   0.2118
MACH   0.5881   0.2269   0.3364   0.2118   1.0000
TABLE D-3: Corrected Validity Coefficients for 18 MOS for
Project A Data with CTP Criterion

PREDICTORS 1-9

MOS          GS         AR         NO         CS         AS         MK         MC         EI         VE
11B    0.656866   0.621229   0.536789   0.463335   0.483873   0.634964   0.545002   0.565734   0.634258
12B    0.683895   0.623335   0.482641   0.354050   0.574970   0.612381   0.612354   0.668390   0.659945
13B    0.422443   0.380366   0.333044   0.275551   0.435487   0.329572   0.419501   0.390942   0.400737
16S    0.493229   0.531860   0.403993   0.399331   0.327244   0.539718   0.400081   0.419077   0.493532
19E    0.599531   0.550098   0.428410   0.348390   0.491584   0.561922   0.535999   0.522067   0.548541
27E    0.660288   0.620168   0.632819   0.561978   0.520991   0.571064   0.552572   0.611584   0.654890
31C    0.477411   0.544064   0.306164   0.186036   0.370263   0.533522   0.453222   0.468237   0.436274
54E    0.577282   0.620872   0.443222   0.409187   0.500734   0.594605   0.550117   0.571048   0.549027
55B    0.481222   0.419575   0.366245   0.358299   0.377841   0.372635   0.453166   0.417673   0.543057
63B    0.461438   0.403683   0.215213   0.245279   0.580193   0.353725   0.549157   0.513533   0.382682
64C    0.300903   0.351512   0.098842   0.075076   0.379292   0.314629   0.377936   0.350880   0.226409
67N    0.398259   0.390994   0.249866   0.200611   0.361391   0.399703   0.398365   0.403935   0.363369
71L    0.451731   0.523196   0.374147   0.332497   0.224830   0.561851   0.351650   0.338491   0.473435
76W    0.724902   0.677280   0.415595   0.470868   0.693884   0.632991   0.705073   0.693100   0.684616
76Y    0.573827   0.607247   0.420033   0.400435   0.445024   0.635650   0.491260   0.558312   0.577437
91A    0.466726   0.457812   0.379704   0.439253   0.390980   0.475335   0.406133   0.403509   0.432202
94B    0.550727   0.682143   0.523106   0.496724   0.399889   0.618427   0.475801   0.464941   0.617696
95B    0.308507   0.361463   0.312987   0.275644   0.221359   0.365255   0.268593   0.297468   0.317382

PREDICTORS 10-17

MOS        SPAT       CPAC       CPSP       NMSA       PSYM       SRAC       SRSP       AUTO
11B    0.652374   0.365831   0.315400   0.534503   0.402093   0.226193   0.233807   0.228662
12B    0.622855   0.309670   0.227002   0.512925   0.372447   0.190950   0.204843   0.227686
13B    0.490541   0.280079   0.218420   0.370732   0.325088   0.129346   0.221210   0.241487
16S    0.539057   0.327187   0.161110   0.491465   0.412220   0.116820   0.117623   0.091227
19E    0.589534   0.366108   0.231046   0.515206   0.372504   0.201491   0.191977   0.147325
27E    0.522335   0.295617   0.277979   0.582803   0.410929   0.178950   0.162618   0.159222
31C    0.449896   0.272000   0.086412   0.403564   0.262090   0.084207   0.097554   0.067000
54E    0.603960   0.318320   0.248402   0.511470   0.379777   0.226676   0.138062   0.173768
55B    0.510010   0.311239   0.134092   0.361530   0.408399   0.204356   0.145157   0.093111
63B    0.518942   0.161082   0.235836   0.310585   0.359398   0.180705   0.167438   0.214925
64C    0.396216   0.219718   0.189678   0.285677   0.256463   0.150450   0.112552   0.161223
67N    0.451617   0.202696   0.058565   0.306593   0.283747   0.082735   0.048330   0.080653
71L    0.524502   0.369656   0.159688   0.416152   0.275218   0.189140   0.159404   0.084238
76W    0.717186   0.321605   0.339029   0.595906   0.486608   0.379508   0.164464   0.191275
76Y    0.541140   0.322173   0.166824   0.562101   0.247266   0.241203   0.208201   0.130668
91A    0.508260   0.207015   0.193966   0.415018   0.316744   0.124493   0.172211   0.138501
94B    0.643599   0.458344   0.245147   0.606390   0.323287   0.302815   0.224887   0.200528
95B    0.375857   0.245767   0.104942   0.327871   0.238337   0.092203   0.079345   0.097684
TABLE D-3 (CONT.): Corrected Validity Coefficients for 18 MOS
for Project A Data with CTP Criterion

PREDICTORS 18-25

MOS        SUPP       ROUT       ADJU       DEPN       COND       SURG       AUDI       COMB
11B    0.123723   -0.30512   0.239024   0.152477   -0.01476   0.290709    0.00232   0.184334
12B    0.115598   -0.29492   0.204694   0.063205   -0.01829   0.229732    0.02302   0.170702
13B    0.170646   -0.25147   0.165706   0.040654   -0.02298   0.125694   -0.03220   0.247007
16S    0.262131   -0.24657   0.192076   0.144852   -0.09337   0.197470    0.01368   0.132213
19E    0.147059   -0.31942   0.209132   0.152346   -0.03104   0.221872    0.04348   0.204331
27E    0.128286   -0.23395   0.180262   0.021000   -0.12227   0.170586   -0.11192   0.166848
31C    0.049802   -0.08717   0.108294   0.113786   -0.09517   0.126693    0.11199   0.108130
54E    0.089491   -0.20924   0.227338   0.167239   -0.07412   0.269000   -0.05607   0.105195
55B    0.110772   -0.29471   0.193754  -0.034219   -0.07633   0.152132   -0.05980   0.176590
63B    0.054786   -0.15637   0.193494   0.065594   -0.06677   0.178713   -0.09691   0.304999
64C    0.092326   -0.13812   0.119468   0.069535   -0.04641   0.112791   -0.04330   0.159918
67N    0.062287   -0.17310   0.191400   0.184216   -0.00327   0.211099    0.02633   0.113485
71L    0.096050   -0.20412   0.184007   0.211097   -0.03759   0.256343    0.09834   0.029767
76W    0.185879   -0.34674   0.273591   0.123225   -0.09744   0.305606    0.00774   0.169576
76Y    0.241912   -0.27059   0.171466   0.148173   -0.09447   0.231602    0.02011  -0.016822
91A    0.184232   -0.15618   0.201112   0.221338   -0.08370   0.237056    0.01486   0.181293
94B    0.211675   -0.25494   0.211309   0.169314   -0.11001   0.253456    0.06162  -0.039677
95B    0.110249   -0.17103   0.159484   0.143164   -0.02086   0.180596   -0.07999   0.022969

PREDICTORS 26-29

MOS        FSER       PSER       TECH       MACH
11B    -0.24569   -0.08846    0.04205   -0.14864
12B    -0.20016   -0.15054    0.02201   -0.07964
13B    -0.13923   -0.04086   -0.03497    0.04404
16S    -0.11756   -0.02380    0.07341   -0.13096
19E    -0.19852   -0.07536    0.11842   -0.06009
27E    -0.20526   -0.23876   -0.03661   -0.08600
31C    -0.03077   -0.07153    0.13561    0.05820
54E    -0.17415   -0.14690    0.01373   -0.07097
55B    -0.13820   -0.13924   -0.06987   -0.00325
63B    -0.18927   -0.09858   -0.10459    0.25888
64C    -0.14868   -0.00988   -0.00728    0.13068
67N    -0.07037   -0.05552    0.06522   -0.03315
71L    -0.10148   -0.02221    0.13728   -0.17941
76W    -0.18391   -0.14190    0.07675   -0.05137
76Y    -0.17607   -0.19286    0.07027   -0.16625
91A    -0.10266   -0.04917    0.04814   -0.00726
94B    -0.02029   -0.12862    0.10869   -0.23578
95B    -0.12993   -0.08489   -0.01798   -0.15076
TABLE D-4: Corrected Validity Coefficients for 18 MOS for
"McLaughlin" Data with SQT/Training Criteria

MOS          AR         AS         CS         EI         GS         MC         MK         NO         VE
05C    0.522481   0.466196   0.301145   0.511024   0.500311   0.519526   0.477943   0.327860   0.487276
11B    0.435398   0.359757   0.295275   0.398889   0.430805   0.425119   0.421797   0.328748   0.414520
12B    0.456009   0.421902   0.234617   0.437370   0.424477   0.482489   0.412656   0.296038   0.379036
13B    0.467479   0.429503   0.277034   0.448068   0.464670   0.472093   0.444128   0.335162   0.432633
16S    0.597241   0.602851   0.350482   0.665018   0.619113   0.653891   0.577616   0.402013   0.562170
19E    0.537569   0.503318   0.342678   0.523518   0.559232   0.562900   0.493701   0.399920   0.539033
27E    0.418988   0.275268   0.366791   0.311033   0.372870   0.332871   0.480805   0.416963   0.510881
54E    0.446692   0.348940   0.081376   0.456362   0.320026   0.467917   0.430625   0.287750   0.321035
55B    0.512050   0.324288   0.404777   0.423100   0.460575   0.416224   0.564711   0.408218   0.488787
63B    0.450968   0.479444   0.204551   0.486390   0.417487   0.486575   0.403886   0.276881   0.383260
64C    0.470434   0.436968   0.278294   0.470200   0.451799   0.467933   0.409456   0.309199   0.456755
67N    0.448919   0.425099   0.242493   0.476322   0.441937   0.533555   0.409983   0.240398   0.375627
71L    0.635395   0.327525   0.494356   0.464185   0.552146   0.448577   0.625413   0.517749   0.642766
76W    0.523521   0.510653   0.277521   0.521921   0.536168   0.533771   0.472523   0.286529   0.440204
76Y    0.531108   0.312413   0.431966   0.402779   0.490305   0.416069   0.568753   0.425842   0.507263
91A    0.211951   0.195167   0.178205   0.182522   0.231842   0.219951   0.183240   0.169034   0.195193
94B    0.577372   0.499129   0.349491   0.551992   0.570028   0.537964   0.504077   0.369834   0.563089
95B    0.562539   0.441142   0.378885   0.511458   0.536842   0.502840   0.525118   0.417818   0.537126

NOTE: The validity matrix for all 60 MOS in the "McLaughlin"
Data Set is available upon request.
APPENDIX E: REMOVING EFFECTS OF NEGATIVE
ROOTS ON PROJECT A VALIDITY MATRIX


Proof that a Matrix Consisting of Column Arrays
of Real Numbers Multiplied by Its Transpose
Must Have All Positive Eigenvalues


1. Notation:

Y = a rectangular matrix of real numbers; the column
arrays represent test scores and the rows represent
individuals.

M_y = Y'Y; every matrix premultiplied by its transpose is
necessarily a square matrix with all diagonal elements
equal to the sums of squares of real numbers and the
off-diagonal elements equal to cross products that can
be either positive or negative real numbers. The
diagonal values for non-zero column arrays are, of
course, positive numbers.

M_w = (YW)'(YW) = W'Y'Y W; defining W as a matrix of
weights consisting of real numbers, it is seen that M_w
is also necessarily a square matrix with all diagonal
elements equal to either positive real numbers or
zeros. The zeros may result from the use of a W which
transforms some of the columns of Y to columns of
zeros; the sums of squares of these columns are, of
course, zeros.

D = a diagonal matrix of eigenvalues; these are the
eigenvalues to which this proof refers.

A = a square eigenvector matrix such that
A'A = AA' = I, and A'(Y'Y)A = D; it is well known
that if Y'Y is of full rank the latter equation will
uniquely exist.

B = a rectangular orthonormal matrix such that
B'B = I, BB' is an idempotent matrix, and
B'(Y'Y)B = D; the latter equation is also uniquely
defined. When Y'Y is not of full rank, the usual
situation, this D is the matrix of eigenvalues we are
interested in proving must be positive numbers when Y
consists of real numbers.
2. Demonstration:

We see that by definition, assuming that Y consists of
real numbers, the diagonal elements of both M_y and M_w
must all be positive or zero. We further see that setting W
equal to either A or B, whichever is appropriate, assures
that M_w is equal to D. Thus, all diagonal elements of D must
be positive or zero if the elements of Y are real numbers, and
we can say that all the eigenvalues of Y'Y will necessarily be
positive or zero if Y is made up of real numbers. If even
one of the eigenvalues of a matrix, M, is negative we can
say with certainty that M is not equal to the product of a
score matrix (made up of real numbers) and its transpose.
That is, M cannot equal Y'Y for any Y defined as above.
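
The same conclusion can be reached one eigenvalue at a time. The following worked step is only a sketch; the symbols a (a unit-length eigenvector of Y'Y) and \lambda (its eigenvalue) are introduced here for illustration and are not part of the notation above.

```latex
\begin{align*}
Y'Y\,a &= \lambda\,a, \qquad a'a = 1 \\
\lambda &= a'(Y'Y)\,a = (Ya)'(Ya) = \sum_{k} (Ya)_k^{2} \;\ge\; 0
\end{align*}
```

Because Ya is a column of real numbers, its sum of squares cannot be negative; this is the diagonal-element argument applied to a single column of YW.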

3. Implications:

A matrix of raw test and/or performance scores can
first be transformed into deviate scores, which all have zeros
as their column means, and then each column can be divided by
its standard deviation to create standard scores. The resulting
matrix Y consists of standard scores when considered by
columns. Each column has a mean of zero and a standard
deviation of one. Obtaining a new Y by dividing this
intermediate one by the square root of the number of rows
gives a matrix for which Y'Y is a matrix of product
moment correlation coefficients. Thus, a correlation matrix
is one of the M_y matrices that cannot have a negative
eigenvalue, and it can be emphatically stated that an
alleged correlation matrix possessing even one negative
eigenvalue could not have been computed from a single set of
scores obtained from the same sample. Alleged correlation
matrices yielding one or more negative eigenvalues can
result from many different causes, including: the use of
incomplete data to compute some cells, the use of
tetrachoric correlation coefficients for some cells, and the
combining of the results from several samples to obtain a
covariance or correlation matrix. Usually small adjustments
to a few cells can provide a corrected correlation matrix
which has all positive or zero eigenvalues.
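
The standardization steps and the eigenvalue check described above can be illustrated numerically. The sketch below assumes NumPy; the function names, the simulated scores, and the 3 x 3 "alleged" matrix are hypothetical and are not taken from the Project A data.

```python
import numpy as np

def correlation_from_scores(raw_scores):
    """Build Y'Y from raw scores using the standardization steps in the text."""
    raw = np.asarray(raw_scores, dtype=float)
    n_rows = raw.shape[0]
    deviates = raw - raw.mean(axis=0)            # column means become zero
    standard = deviates / deviates.std(axis=0)   # column SDs become one (divisor N)
    y = standard / np.sqrt(n_rows)               # scale so that Y'Y holds correlations
    return y.T @ y

def smallest_root(matrix):
    """Smallest eigenvalue of a symmetric matrix."""
    return float(np.linalg.eigvalsh(matrix).min())

# Hypothetical data: 50 individuals, 4 tests, all from one sample.
rng = np.random.default_rng(0)
scores = rng.normal(size=(50, 4))
r_single_sample = correlation_from_scores(scores)

# Hypothetical "alleged" correlation matrix whose cells are mutually
# inconsistent, as can happen when cells come from different samples.
r_alleged = np.array([[ 1.0, 0.9, -0.9],
                      [ 0.9, 1.0,  0.9],
                      [-0.9, 0.9,  1.0]])

print("single-sample matrix, smallest root:", smallest_root(r_single_sample))
print("alleged matrix, smallest root:", smallest_root(r_alleged))
```

A matrix computed from one complete set of scores has no negative root beyond rounding error, whereas the patched matrix has one, so it cannot equal Y'Y for any real score matrix Y.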
Steps to Remove the Effects of Negative Roots
on Project A Validity Matrix
(Source: Whetzel, 1991)


1. $V R_t^{-1} V' = C_p$

2. $\begin{bmatrix} R_t \\ V \end{bmatrix} T_1 = \begin{bmatrix} F_t \\ F_v \end{bmatrix}$,  $\begin{bmatrix} F_t \\ F_v \end{bmatrix} T_2 = \begin{bmatrix} F_q \\ F_p \end{bmatrix}$

where $F_p$ is the principal components solution of $C_p$,
$T_1 = A_t D_t^{-1/2}$, and $T_2$ is found by solving $T_2' (F_v' F_v) T_2 = D_2$

3. Delete factors with negative roots from $F_p$ to obtain $F_p^+$

4. $V^+ = F_p^+ F_q^{+\prime}$

5. Compute $V^+ R_t^{-1} V^{+\prime} = C_p^+$
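
The core idea of steps 3 to 5, deleting the factors that carry negative roots and rebuilding the matrix from the factors that remain, can be illustrated on any symmetric matrix. The sketch below is a generic illustration only and is not Whetzel's (1991) procedure, which operates on the validity and intercorrelation matrices jointly; the function name and the example matrix are hypothetical.

```python
import numpy as np

def rebuild_without_negative_roots(sym_matrix, tol=1e-12):
    """Rebuild a symmetric matrix from its factors with positive roots only.

    Returns the rebuilt matrix and the number of negative roots deleted.
    """
    sym = np.asarray(sym_matrix, dtype=float)
    roots, vectors = np.linalg.eigh(sym)                 # roots in ascending order
    keep = roots > tol
    factors = vectors[:, keep] * np.sqrt(roots[keep])    # principal-component loadings
    n_deleted = int((roots < -tol).sum())
    return factors @ factors.T, n_deleted

# Hypothetical symmetric matrix with exactly one negative root.
c_example = np.array([[ 1.0, 0.9, -0.9],
                      [ 0.9, 1.0,  0.9],
                      [-0.9, 0.9,  1.0]])

c_plus, n_deleted = rebuild_without_negative_roots(c_example)
print("negative roots deleted:", n_deleted)
print("smallest root after rebuilding:", np.linalg.eigvalsh(c_plus).min())
```

Note that the rebuilt matrix generally no longer has an exact unit diagonal, so a rescaling step would be needed if it is to be read as a correlation matrix.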
APPENDIX F: ANALYSIS SAMPLE GENERATION


PROCEDURE FOR GENERATING ANALYSIS SAMPLE


a. Notation

   N = number of entities in an MOS sample
   n = number of test variables (j = 1...29)
   m = number of jobs (i = 1...18)
   $R_t$ = 29x29 matrix of predictor intercorrelations
   $V$ = m x 29 matrix of validity coefficients
   $X_m$ = N x 30 matrix of random normal deviates for the
           mth job sample

b. Compute for each of the 18 job samples, 29 synthetic
   test scores and 1 criterion score using the Gramian
   factor solution of $R_tV_i$ as the transformation matrix.

   b.1 $F_i = A D^{1/2} A'$, the Gramian factor of $R_tV_i$

       where,

       $F_i$ = 30x30 transformation matrix for the ith
               job sample
       $R_tV_i$ = matrix of 29 test intercorrelations in
               rows and columns 1-29 plus vectors of 29
               validities in 30th row and column for
               the ith job;

               $R_tV_i = \begin{bmatrix} R_t & v_i' \\ v_i & 1.0 \end{bmatrix}$

       A = eigenvectors of 30x30 intercorrelation-
           validity matrix
       D = diagonal matrix of eigenvalues of
           intercorrelation-validity matrix

   b.2 $Y_i = X_i F_i$

       where,

       $Y_i$ = Nx30 matrix of test and criterion scores
               (with the same expected parameters as
               the population) for N entities in the
               ith job sample.

c. Compute matrix of sums-of-squares and cross-products,
   $Q_i = Y_i' Y_i$, for each job sample.

d. Compute the vector of covariances, $c_i$, for each job
   sample.

   d.1 Identify $q_i$, the 30th row of $Q_i$.

   d.2 $m_i = (1' Y_i)(1/N)$

       where,

       $m_i$ = row vector of means of 29 predictors and
               1 criterion (1x30)
       1 = summing vector of 1s.

   d.3 $c_i = (1/N) q_i - (m_i^* m_i)$; and drop the 30th element.

       where,

       $m_i^*$ = a scalar which is the 30th element of $m_i$
       $c_i$ = 1x29 vector of covariances of the
               predictors with the criterion

e. Compute analysis sample validity matrix ($V_a$) using 18
   $c_i$s.

   e.1 For each job sample, compute validities for 29
       predictors: $v_{ai} = (S_i^{-1/2}) c_i (1/s_i)$

       where,

       $v_{ai}$ = 1x29 vector of validities between 29
               predictors and 1 criterion variable for
               each job sample
       $S_i$ = diagonal matrix (taken from $Q_i$) with the
               variances of the 29 predictors and 1 job
               sample criterion in the diagonal
       $s_i$ = scalar which is the covariance of the
               criterion for the ith job sample; it is
               the 30th element of the matrix $S_i$

   e.2 Assemble 18x29 validity matrix ($V_a$) for combined
       job samples using $v_{ai}$.

       $V_a = (v_{a1}', v_{a2}', \ldots, v_{a18}')'$

f. Compute analysis sample intercorrelation matrix ($R_a$)

   f.1 Drop criterion variable from 18 $Q_i$s.

   f.2 Sum the 18 $Q_i$s weighted by sample size:

       $Q_t = \sum_{i=1}^{18} N_i Q_i$

   f.3 Compute analysis sample intercorrelation matrix:

       a. $m_t = [\sum_{i=1}^{18} N_i (m_i)] \, (1 / \sum_{i=1}^{18} N_i)$

       b. $C_t = Q_t (1 / \sum_{i=1}^{18} N_i) - m_t' (m_t)$

          where,

          $C_t$ = combined covariance matrix for the
                  18 job samples
          $Q_t$ = combined sums-of-squares and cross-
                  products matrix for the 18 job
                  samples

       c. $R_a = S^{-1/2} C_t S^{-1/2}$

          where,

          S = diagonal matrix of the diagonal
              elements of $C_t$ (i.e., predictor
              variances).
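
Steps b through f can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the program used for the report: all function names and the toy population (2 predictors, 2 jobs) are hypothetical; NumPy's random generator is used, so the seeds in Table F-1 would not reproduce the original deviates; each validity is computed by dividing the covariance by both standard deviations, so that it is a correlation; and the pooled cross-products are formed by summing the raw Y'Y matrices, which already accumulate over the entities in each sample.

```python
import numpy as np

def gramian_factor(corr):
    """Symmetric factor F = A D^(1/2) A', so that F F' reproduces corr."""
    roots, vectors = np.linalg.eigh(corr)
    if roots.min() < -1e-10:
        raise ValueError("matrix has a negative root; no real Gramian factor")
    roots = np.clip(roots, 0.0, None)
    return (vectors * np.sqrt(roots)) @ vectors.T

def synthetic_job_sample(r_t, validities, n_entities, rng):
    """Steps b.1-b.2: Y = X F for one job; the last column is the criterion."""
    k = r_t.shape[0]
    r_tv = np.ones((k + 1, k + 1))        # bordered intercorrelation-validity matrix
    r_tv[:k, :k] = r_t
    r_tv[:k, k] = validities
    r_tv[k, :k] = validities
    x = rng.standard_normal((n_entities, k + 1))
    return x @ gramian_factor(r_tv)

def sample_validities(y):
    """Steps c-e.1: predictor-criterion correlations within one job sample."""
    n = y.shape[0]
    q = y.T @ y                           # sums of squares and cross-products
    m = y.sum(axis=0) / n                 # means of predictors and criterion
    cov_with_criterion = q[-1] / n - m[-1] * m
    variances = np.diag(q) / n - m ** 2
    return cov_with_criterion[:-1] / np.sqrt(variances[:-1] * variances[-1])

def pooled_intercorrelations(samples):
    """Step f: predictor intercorrelations over all job samples combined."""
    predictors = [y[:, :-1] for y in samples]          # f.1: drop the criterion
    total_n = sum(p.shape[0] for p in predictors)
    q_t = sum(p.T @ p for p in predictors)
    m_t = sum(p.sum(axis=0) for p in predictors) / total_n
    c_t = q_t / total_n - np.outer(m_t, m_t)
    s_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(c_t)))
    return s_inv_sqrt @ c_t @ s_inv_sqrt

# Hypothetical toy population: 2 predictors, 2 jobs, 20000 entities per job.
r_t = np.array([[1.0, 0.5], [0.5, 1.0]])
v = np.array([[0.4, 0.3], [0.2, 0.6]])
rng = np.random.default_rng(12345)
samples = [synthetic_job_sample(r_t, v[i], 20000, rng) for i in range(2)]
v_a = np.vstack([sample_validities(y) for y in samples])
r_a = pooled_intercorrelations(samples)
print(np.round(v_a, 2))
print(np.round(r_a, 2))
```

With samples this large, the analysis-sample validities and intercorrelations should fall close to the population values supplied, differing only by sampling error.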
TABLE F-1


INITIALIZATION SEEDS USED TO GENERATE MATRIX OF
RANDOM NORMAL DEVIATES (X_m) FOR ANALYSIS SAMPLE


X_m for mth
MOS sample      Initialization Seed

X1              1587175283
X2              1976274088
X3              1271376363
X4              1728981280
X5               450351973
X6               724467093
X7              2015440489
X8               277105243
X9                68383890
X10             1964442650
X11              135371761
X12              605699282
X13             1656991152
X14              234809644
X15              805238231
X16              399052330
X17             2003666395
X18             1300943904

TABLE  F-2:  Analysis  Sample  Predictor  Intercorrelations 
(see  Appendix  B  for  code  names) 


PREDICTORS 1-9

            GS        AR        NO        CS        AS        MK        MC        EI        VE
GS     1.00000   0.71978   0.52743   0.44542   0.64820   0.68728   0.71156   0.76236   0.79520
AR     0.71978   1.00000   0.62800   0.49502   0.54084   0.82568   0.68904   0.65296   0.73448
NO     0.52743   0.62800   1.00000   0.69626   0.32077   0.62096   0.40977   0.41260   0.62451
CS     0.44542   0.49502   0.69626   1.00000   0.22223   0.49864   0.33390   0.33251   0.56959
AS     0.64820   0.54084   0.32077   0.22223   1.00000   0.41681   0.73911   0.75078   0.51914
MK     0.68728   0.82568   0.62096   0.49864   0.41681   1.00000   0.59577   0.57768   0.69993
MC     0.71156   0.68904   0.40977   0.33390   0.73911   0.59577   1.00000   0.73880   0.60463
EI     0.76236   0.65296   0.41260   0.33251   0.75078   0.57768   0.73880   1.00000   0.66455
VE     0.79520   0.73448   0.62451   0.56959   0.51914   0.69993   0.60463   0.66455   1.00000
SPAT   0.67339   0.73280   0.53150   0.49495   0.56744   0.67778   0.73758   0.60739   0.62930
CPAC   0.31255   0.35070   0.29975   0.31894   0.19571   0.33621   0.26699   0.26318   0.36539
CPSP   0.33420   0.30202   0.32794   0.31336   0.25990   0.30185   0.32482   0.27826   0.29779
NMSA   0.59448   0.72021   0.70341   0.55597   0.41169   0.67984   0.50018   0.49857   0.66101
PSYM   0.46243   0.43935   0.34584   0.29845   0.46265   0.37799   0.54713   0.46232   0.38456
SRAC   0.22610   0.21320   0.16104   0.17295   0.19423   0.18480   0.20497   0.20906   0.24201
SRSP   0.23572   0.23633   0.27474   0.26668   0.16228   0.22299   0.20377   0.20315   0.25615
AUTO   0.25845   0.24468   0.20359   0.16786   0.24415   0.20559   0.23501   0.24783   0.28000
SUPP   0.13595   0.12318   0.16773   0.17596   0.04155   0.13357   0.05720   0.09501   0.20500
ROUT  -0.31485  -0.30689  -0.25269  -0.23590  -0.24090  -0.26236  -0.29184  -0.26263  -0.33297
ADJU   0.24471   0.25308   0.20380   0.13960   0.20404   0.23283   0.23388   0.23516   0.23970
DEPN   0.06508   0.09893   0.14136   0.16064  -0.03845   0.15240   0.01137   0.03938   0.10231
COND  -0.04943  -0.02892  -0.00414  -0.03948  -0.02557  -0.02673  -0.00974  -0.04200  -0.05606
SURG   0.22196   0.26116   0.23745   0.20869   0.14966   0.25444   0.18511   0.20147   0.24407
AUDI   0.01454   0.02279   0.01324   0.02628  -0.08044   0.06490  -0.00826  -0.00617   0.05884
COMB   0.15768   0.05750  -0.03191  -0.06616   0.34264   0.00238   0.24480   0.22622   0.03818
FSER  -0.19883  -0.18273  -0.14583  -0.13293  -0.22763  -0.13563  -0.22383  -0.20691  -0.18970
PSER  -0.09213  -0.14403  -0.14491  -0.12370   0.01535  -0.15871  -0.06066  -0.05814  -0.13622
TECH  -0.00629   0.06884   0.11075   0.07234  -0.13633   0.13780  -0.06826  -0.04307   0.05530
MACH  -0.14907  -0.18398  -0.27661  -0.29726   0.18344  -0.21946   0.04437   0.00556  -0.29738

PREDICTORS  10-19

SPAT  CPAC  CPSP  NMSA  PSYM  SRAC  SRSP  AUTO  SUPP  ROUT

GS  0.67339  0.31255  0.33420  0.59448  0.46243  0.22610  0.23572  0.25845  0.13595  -0.31485
AR  0.73280  0.35070  0.30202  0.72021  0.43935  0.21320  0.23633  0.24468  0.12318  -0.30689
NO  0.53150  0.29975  0.32794  0.70341  0.34584  0.16104  0.27474  0.20359  0.16773  -0.25269
CS  0.49495  0.31894  0.31336  0.55597  0.29845  0.17295  0.26668  0.16786  0.17596  -0.23590
AS  0.56744  0.19571  0.25990  0.41169  0.46265  0.19423  0.16228  0.24415  0.04155  -0.24090
MK  0.67778  0.33621  0.30185  0.67984  0.37799  0.18480  0.22299  0.20559  0.13357  -0.26236
MC  0.73758  0.26699  0.32482  0.50018  0.54713  0.20497  0.20377  0.23501  0.05720  -0.29184
EI  0.60739  0.26318  0.27826  0.49857  0.46232  0.20906  0.20315  0.24783  0.09501  -0.26263
VE  0.62930  0.36539  0.29779  0.66101  0.38456  0.24201  0.25615  0.28000  0.20500  -0.33297
SPAT  1.00000  0.38335  0.41062  0.61824  0.59619  0.21981  0.26627  0.22440  0.09566  -0.29272
CPAC  0.38305  1.00000  -0.20501  0.30353  0.25474  0.24055  0.08598  0.07289  0.10032  -0.15041
CPSP  0.41062  -0.20501  1.00000  0.42550  0.37502  0.06074  0.35994  0.10256  0.04153  -0.13127
NMSA  0.61824  0.30353  0.42550  1.00000  0.44084  0.19430  0.31724  0.20595  0.15041  -0.27015
PSYM  0.59619  0.25474  0.37502  0.44084  1.00000  0.13981  0.28262  0.16027  0.06490  -0.21372
SRAC  0.21981  0.24055  0.06074  0.19430  0.13981  1.00000  0.11849  0.05171  0.05154  -0.09579
SRSP  0.26627  0.08598  0.35994  0.31724  0.28262  0.11849  1.00000  0.07710  0.05247  -0.12173
AUTO  0.22440  0.07289  0.10256  0.20595  0.16027  0.05171  0.07710  1.00000  0.27862  -0.15299
SUPP  0.09566  0.10032  0.04153  0.15041  0.06490  0.05154  0.05247  0.27862  1.00000  -0.24772
ROUT  -0.29272  -0.15041  -0.13127  -0.27015  -0.21372  -0.09579  -0.12173  -0.15299  -0.24772  1.00000
ADJU  0.22158  0.09515  0.13258  0.21836  0.19990  0.07677  0.14092  0.12526  0.10806  -0.20918
DEPN  0.04447  0.09187  0.00438  0.09911  -0.02457  0.01941  0.03227  0.00744  0.25074  -0.05688
COND  -0.03273  -0.04344  0.04337  -0.00360  0.10637  -0.06069  0.02220  0.05974  0.07655  -0.08549
SURG  0.19479  0.11215  0.11621  0.22610  0.12887  0.05397  0.10666  0.19722  0.34122  -0.26958
AUDI  0.01702  0.03381  0.01624  -0.01681  -0.02126  -0.00584  -0.01819  0.09742  0.17444  -0.00857
COMB  0.14999  -0.01369  0.08413  0.02551  0.24652  0.00801  0.01637  0.16565  0.02968  -0.06994
FSER  -0.21667  -0.09562  -0.11637  -0.17283  -0.22699  -0.07272  -0.09337  -0.10740  -0.04133  0.20022
PSER  -0.09294  -0.08502  0.01028  -0.11166  0.01389  -0.00095  -0.00098  -0.00419  0.07959  0.05027
TECH  -0.01612  0.05277  -0.00688  0.07134  -0.04281  -0.03367  0.00136  0.07628  0.25116  -0.00017
MACH  -0.06805  -0.13882  -0.03936  -0.22019  0.06004  -0.07074  -0.08230  0.03570  -0.06237  0.12232
TABLE  F-2  (CONT.):  Analysis  Sample  Predictor  Intercorrelations

Predictors  20-29 

ADJU  DEPN  COND  SURG  AUDI  COMB  FSER  PSER  TECH  MACH 

GS  0.24471  0.06508  -0.04943  0.22196  0.01454  0.15768  -0.19883  -0.09213  -0.00629  -0.14907 

AR  0.25308  0.09893  -0.02892  0.26116  0.02279  0.05750  -0.18273  -0.14403  0.06884  -0.18398 

NO  0.20380  0.14136  -0.00414  0.23745  0.01324  -0.03191  -0.14583  -0.14491  0.11075  -0.27661 

CS  0.13960  0.16064  -0.03948  0.20869  0.02628  -0.06616  -0.13293  -0.12370  0.07234  -0.29726 

AS  0.20404  -0.03845  -0.02557  0.14966  -0.08044  0.34264  -0.22763  0.01535  -0.13633  0.18344 

MK  0.23283  0.15240  -0.02673  0.25444  0.06490  0.00238  -0.13563  -0.15871  0.13780  -0.21946 

MC  0.23388  0.01137  -0.00974  0.18511  -0.00826  0.24480  -0.22383  -0.06066  -0.06826  0.04437

EI  0.23516  0.03938  -0.04200  0.20147  -0.00617  0.22622  -0.20691  -0.05814  -0.04307  0.00556

VE  0.23970  0.10231  -0.05606  0.24407  0.05884  0.03818  -0.18970  -0.13622  0.05530  -0.29738 

SPAT  0.22158  0.04447  -0.03273  0.19479  0.01702  0.14999  -0.21667  -0.09294  -0.01612  -0.06805 

CPAC  0.09515  0.09187  -0.04344  0.11215  0.03381  -0.01369  -0.09562  -0.08502  0.05277  -0.13882 

CPSP  0.13258  0.00438  0.04337  0.11621  0.01624  0.08413  -0.11637  0.01028  -0.00688  -0.03936 

NMSA  0.21836  0.09911  -0.00360  0.22610  -0.01681  0.02551  -0.17283  -0.11166  0.07134  -0.22019

PSYM  0.19990  -0.02457  0.10637  0.12887  -0.02126  0.24652  -0.22699  0.01389  -0.04281  0.06004

SRAC  0.07677  0.01941  -0.06070  0.05398  -0.00585  0.00802  -0.07273  -0.00096  -0.03368  -0.07074 

SRSP  0.14092  0.03227  0.02220  0.10666  -0.01819  0.01637  -0.09337  -0.00098  0.00136  -0.08230 

AUTO  0.12526  0.00744  0.05974  0.19722  0.09742  0.16565  -0.10740  -0.00419  0.07628  0.03570 

SUPP  0.10806  0.25074  0.07655  0.34122  0.17444  0.02968  -0.04133  0.07959  0.25116  -0.06237 

ROUT  -0.20918  -0.05688  -0.08549  -0.26958  -0.00857  -0.06994  0.20022  0.05027  -0.00017  0.12232 

ADJU  1.00000  0.33293  0.22715  0.60838  0.04989  0.16496  -0.08336  0.03047  0.12820  -0.00922 

DEPN  0.33293  1.00000  0.13667  0.59075  0.20251  -0.04367  0.06396  0.03126  0.31410  -0.10894 

COND  0.22715  0.13667  1.00000  0.34333  0.07349  0.15782  -0.04547  0.13498  0.10826  0.14244 

SURG  0.60838  0.59075  0.34333  1.00000  0.17266  0.17082  -0.04036  0.07756  0.28651  -0.00400 

AUDI  0.04989  0.20251  0.07349  0.17266  1.00000  0.17159  0.31729  0.13361  0.67286  0.19789

COMB  0.16496  -0.04367  0.15782  0.17082  0.17159  1.00000  0.09267  0.38155  0.17102  0.57606 

FSER  -0.08336  0.06396  -0.04547  -0.04036  0.31729  0.09267  1.00000  0.16751  0.35184  0.23177 

PSER  0.03047  0.03126  0.13498  0.07756  0.13361  0.38155  0.16751  1.00000  0.20934  0.34614 

TECH  0.12820  0.31410  0.10826  0.28651  0.67286  0.17102  0.35184  0.20934  1.00000  0.19457 

MACH  -0.00922  -0.10894  0.14244  -0.00400  0.19789  0.57606  0.23177  0.34614  0.19457  1.00000 


TABLE  F-3:  Analysis  Sample  Validity  Coefficients  for  18  MOS 

PREDICTORS  1-9 

GS  AR  NO  CS  AS  MK  MC  EI  VE

11B  0.616324  0.562951  0.473442  0.383062  0.468460  0.562460  0.477467  0.519798  0.600048

12B  0.674561  0.591346  0.530004  0.329280  0.589070  0.607769  0.627996  0.693098  0.649865 

13B  0.407003  0.290783  0.239132  0.199683  0.444174  0.240448  0.402179  0.413035  0.388540 

16S  0.532369  0.564768  0.392423  0.399841  0.312493  0.551154  0.415636  0.349667  0.474066 

19E  0.643871  0.566385  0.452945  0.324618  0.526622  0.567517  0.613683  0.566852  0.583430 

27E  0.687531  0.657306  0.664267  0.530909  0.572547  0.640832  0.561024  0.626965  0.659186 

31C  0.538728  0.587352  0.348889  0.158829  0.461669  0.572805  0.472554  0.573544  0.475890 

54E  0.575748  0.644649  0.396164  0.392442  0.538695  0.593120  0.541257  0.558782  0.561591 

55B  0.501583  0.426438  0.364091  0.390143  0.338015  0.412250  0.414435  0.442930  0.585551 

63B  0.478853  0.414483  0.280681  0.292236  0.565156  0.362594  0.541182  0.497081  0.418200

64C  0.311752  0.375833  0.100337  0.134280  0.410017  0.339482  0.397166  0.368794  0.264535 

67M  0.416102  0.415564  0.310853  0.220211  0.424643  0.422864  0.464800  0.470580  0.401287 

71L  0.466161  0.571514  0.441537  0.371583  0.224624  0.610297  0.376136  0.361487  0.550738

76U  0.745418  0.715050  0.457936  0.523495  0.706044  0.669409  0.725134  0.723530  0.718848 

76Y  0.535583  0.558070  0.422517  0.366543  0.417376  0.581944  0.428227  0.509820  0.543894 

91A  0.495820  0.486555  0.335059  0.416737  0.409915  0.506523  0.428372  0.408871  0.420976 

94B  0.609311  0.734807  0.631953  0.567449  0.401812  0.658941  0.518803  0.521831  0.678982

95B  0.304223  0.411412  0.302817  0.305440  0.254008  0.370947  0.292374  0.289180  0.313460 
APPENDIX G: CLASSIFICATION-EFFICIENT PROGRAM


C--------------------------------------------------------------C
C  PROGRAM:  CLASSIFICATION-EFFICIENT CLUSTERING PROGRAM       C
C  PURPOSE:  CLUSTER JOBS INTO 6, 9, OR 12 JOB FAMILIES        C
C            WHILE MAXIMIZING HORST'S DIFFERENTIAL INDEX       C
C--------------------------------------------------------------C
C


C * FUNCTIONS AND SUBROUTINES APPEAR FIRST IN THE PROGRAM
C
C * SUBROUTINE THAT INCREASES THE ROW DESIGNATORS AS NEEDED
      SUBROUTINE CHECK(X, Y, MAX)
      INTEGER X, Y, MAX
      IF (X .LE. (MAX-1)) THEN
        IF (Y .GT. MAX) THEN
          X = X + 1
          Y = X + 1
        END IF
      END IF
      RETURN
      END
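
The stepping performed by CHECK can be re-expressed outside FORTRAN. The following is a minimal Python sketch, not part of the original listing: `check` mirrors the subroutine, while `enumerate_pairs` is a hypothetical driver that imitates how the main program calls CHECK, uses the pair, and then increments the second designator. Under those assumptions, 18 rows produce the 153 pairs that the program hard-codes as NC.

```python
from itertools import combinations


def check(r1, r2, max_row):
    """Mirror of SUBROUTINE CHECK: once the second designator has run past
    the last row, advance the first designator and restart the second."""
    if r1 <= max_row - 1 and r2 > max_row:
        r1 += 1
        r2 = r1 + 1
    return r1, r2


def enumerate_pairs(max_row):
    """Hypothetical driver: call check(), record the pair, then bump r2,
    repeating 'max_row choose 2' times as the main loop does."""
    n_pairs = max_row * (max_row - 1) // 2
    pairs = []
    r1, r2 = 1, 2
    for _ in range(n_pairs):
        r1, r2 = check(r1, r2, max_row)
        pairs.append((r1, r2))
        r2 += 1
    return pairs


pairs_18 = enumerate_pairs(18)
print(len(pairs_18))
# The visiting order should agree with the standard lexicographic order.
print(pairs_18 == list(combinations(range(1, 19), 2)))
```

The same driver with 17 rows gives 136 pairs, which is consistent with the listing's update `NC = NC - NUM` after each merge.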

C * SUBROUTINE THAT LOCATES THE SMALLEST VALUE IN D MATRIX
      SUBROUTINE LOCATE(D, X, SMLCOL, SMLROW)
      REAL D
      DIMENSION D(18,18)
      INTEGER X, I, J, SMLCOL, SMLROW
      SMLCOL = 1
      SMLROW = 1
      DO 20 I = 1,X
        DO 19 J = 2,X
          IF (D(I,J) .LT. D(SMLROW,SMLCOL)) THEN
            SMLCOL = J
            SMLROW = I
          END IF
   19   CONTINUE
   20 CONTINUE
      RETURN
      END

C--------------------------------------------------------------C
C  MAIN PROGRAM                                                C
C--------------------------------------------------------------C

C * DECLARE VARIABLES
      REAL F, K, TEMP, C, G, GT, MULT, TOTAL, GGT, DG, M, SUM, HD
      REAL A, SQ, S, B, D, N1, N2
      DIMENSION F(18,32), C(18), G(18,18), GT(18,18), GGT(18,18)
      DIMENSION DG(18), M(18), A(18,18), B(18,18), D(18,18)
      INTEGER NUMCLS, X, Y, NC, I, J, NUM, Q, L, P, R1, R2
      INTEGER SCOL, SROW, S1, S2, Z, R, T
      CHARACTER*8 SOURCE

C * VARIABLES THAT WILL NEED TO BE CHANGED FOR EACH CONDITION
C * NUMBER OF CLUSTERS (6, 9, 12)
      NUMCLS = 6
C * DATA SOURCE (PROJA29, PROJA9, MCGL)
      SOURCE = 'MCGL'
C * NUMBER OF JOBS (ROWS)
      X = 18
C * NUMBER OF FACTORS--COLUMNS IN F MATRIX (9 OR 18)
      Y = 9
C * VALUE AFTER CALCULATION OF 18 CHOOSE 2
      NC = 153

C * READ IN DATA
      IF (SOURCE .EQ. 'PROJA29') THEN
        READ (5,20) ((F(I,J), J=1,18), I=1,18)
   20   FORMAT (18(1X,F9.6))
      END IF
      IF ((SOURCE .EQ. 'PROJA9') .OR. (SOURCE .EQ. 'MCGL')) THEN
        READ (5,28) ((F(I,J), J=1,9), I=1,18)
   28   FORMAT (9(1X,F9.6))
      END IF
C * ZERO OUT SPACES AT END OF EACH ROW OF THE F MATRIX
      DO 35 I = 1,18
        DO 34 J = (Y+1), (Y+14)
          F(I,J) = 0
   34   CONTINUE
   35 CONTINUE
C * ADD COUNTER AND JOB DESIGNATORS TO THE END OF THE ROWS
C   OF THE F MATRIX
      K = 0
      DO 40 I = 1,18
        K = K + 1.0
        F(I,(Y+1)) = 1.0
        F(I,(Y+2)) = K
   40 CONTINUE

C * WRITE OUT ORIGINAL F MATRIX TO CHECK PROGRAM
      WRITE (6,41)
   41 FORMAT (/1X, 'THE ORIGINAL F MATRIX IS:')
      WRITE (6,43) ((F(I,J), J=1,9), I=1,18)
   43 FORMAT (9(1X,F9.6))
C * SET COUNTER TO TRACK THE NUMBER OF CLUSTERS
      NUM = X
C * SET COUNTER FOR THE NUMBER OF ITERATIONS
C * Q IS THE TOTAL NUMBER OF ITERATIONS
      Q = X - NUMCLS + 1
C * CALCULATE COLUMN MEANS OF F MATRIX--STORE IN C VECTOR
C * THE SAME COLUMN MEANS OF F WILL BE USED THROUGHOUT THE PROGRAM
      DO 50 I = 1,Y
        TEMP = 0
        DO 45 J = 1,NUM
          TEMP = TEMP + F(J,I)
   45   CONTINUE
        C(I) = TEMP/NUM
   50 CONTINUE
      WRITE (6,52)
   52 FORMAT (/1X, 'COLUMN MEANS OF F MATRIX')
      WRITE (6,54) (C(I), I=1,Y)
   54 FORMAT (9(1X,F6.3))

C * BEGIN LARGE LOOP OF THE PROGRAM
      DO 450 L = 1,Q
C * ZERO OUT VECTORS
      DO 60 I = 1,X
        DG(I) = 0
        M(I) = 0
   60 CONTINUE
C * ZERO OUT MATRICES
      DO 65 I = 1,X
        DO 64 J = 1,X
          G(I,J) = 0
          GT(I,J) = 0
          GGT(I,J) = 0
          A(I,J) = 0
          B(I,J) = 0
          D(I,J) = 0
   64   CONTINUE
   65 CONTINUE
C * WRITE OUT ITERATION NUMBER
      WRITE (6,85)
   85 FORMAT (/1X, 'ITERATION NUMBER:')
      WRITE (6,87) L
   87 FORMAT (I2)

C * CALCULATE G MATRIX OF DEVIATIONS (VALUE IN EACH COLUMN MINUS
C * ITS COLUMN MEAN)
      DO 100 I = 1,Y
        DO 95 J = 1,NUM
          G(J,I) = (F(J,I) - C(I))
   95   CONTINUE
  100 CONTINUE
C     WRITE (6,110)
C 110 FORMAT (/1X, 'G MATRIX OF DEVIATIONS')
C     WRITE (6,112) ((G(I,J), J=1,Y), I=1,NUM)
C 112 FORMAT (9(1X,F9.6))
C * TRANSPOSE G MATRIX
      DO 120 I = 1,NUM
        DO 119 J = 1,Y
          GT(J,I) = G(I,J)
  119   CONTINUE
  120 CONTINUE
C * CALCULATE GG'
      DO 150 I = 1,NUM
        DO 149 P = 1,NUM
          TOTAL = 0
          DO 140 J = 1,Y
            MULT = G(I,J) * GT(J,P)
            TOTAL = TOTAL + MULT
  140     CONTINUE
          GGT(I,P) = TOTAL
  149   CONTINUE
  150 CONTINUE
C * FORM DG VECTOR FROM DIAGONAL ELEMENTS OF GGT
      DO 180 I = 1,NUM
        DG(I) = GGT(I,I)
  180 CONTINUE
      WRITE (6,181)
  181 FORMAT (/1X, 'DG VECTOR')
      WRITE (6,183) (DG(I), I=1,NUM)
  183 FORMAT (18(1X,F6.3))

C * CREATE A VECTOR CONTAINING NUMBER OF JOBS (M) IN EACH FAMILY
C * TO BE USED IN SUBSEQUENT MULTIPLICATIONS
      DO 185 I = 1,NUM
        M(I) = F(I,(Y+1))
  185 CONTINUE
      WRITE (6,186)
  186 FORMAT (/1X, 'M VECTOR')
      WRITE (6,188) (M(I), I=1,NUM)
  188 FORMAT (18(1X,F5.3))
C * CALCULATE HORST'S DIFFERENTIAL INDEX
      SUM = 0
      DO 190 I = 1,NUM
        SUM = SUM + (DG(I) * M(I))
  190 CONTINUE
      HD = SUM
      WRITE (6,195)
  195 FORMAT (/1X, 'HORST INDEX')
      WRITE (6,197) HD
  197 FORMAT (1X, F9.6)

C
C * STOP PROGRAM ON LAST ITERATION SO THAT HD IS CALCULATED BUT
C * NOTHING ELSE IS DONE
C
      IF (L .EQ. Q) GO TO 500
C
C * CALCULATE "A" MATRIX
      R1 = 1
      R2 = R1 + 1
      DO 250 I = 1,NC
        TOTAL = 0
        CALL CHECK(R1,R2,NUM)
        TOTAL = (M(R1) * DG(R1)) + (M(R2) * DG(R2))
        A(R1,R2) = TOTAL
        A(R2,R1) = TOTAL
        A(R1,R1) = 1.0
        R2 = R2 + 1
  250 CONTINUE
      A(NUM,NUM) = 1.0
      WRITE (6,260)
  260 FORMAT (/1X, 'A MATRIX')
      WRITE (6,262) ((A(I,J), J=1,X), I=1,X)
  262 FORMAT (18(1X,F6.3))

C * CALCULATE "B" MATRIX
      R1 = 1
      R2 = R1 + 1
      DO 290 I = 1,NC
        TOTAL = 0
        SQ = 0
        DO 280 J = 1,Y
          CALL CHECK(R1,R2,NUM)
          S = (((M(R1)*F(R1,J))+(M(R2)*F(R2,J)))/(M(R1)+M(R2)))
     +        - C(J)
          SQ = S * S
          TOTAL = TOTAL + SQ
  280   CONTINUE
        B(R1,R2) = TOTAL * (M(R1) + M(R2))
        B(R2,R1) = TOTAL * (M(R1) + M(R2))
        B(R1,R1) = 0
        R2 = R2 + 1
  290 CONTINUE
      WRITE (6,295)
  295 FORMAT (/1X, 'B MATRIX')
      WRITE (6,297) ((B(I,J), J=1,X), I=1,X)
  297 FORMAT (18(1X,F6.3))

C * CALCULATE (A-B) TO GET A D MATRIX
      DO 320 I = 1,NUM
        DO 319 J = 1,NUM
          D(I,J) = A(I,J) - B(I,J)
  319   CONTINUE
  320 CONTINUE
      WRITE (6,325)
  325 FORMAT (/1X, 'D MATRIX')
      WRITE (6,327) ((D(I,J), J=1,X), I=1,X)
  327 FORMAT (18(1X,F6.3))
C * LOCATE SMALLEST VALUE IN D MATRIX
      CALL LOCATE(D,NUM,SCOL,SROW)
      WRITE (6,340)
  340 FORMAT (/1X, 'THE SMALLEST VALUE IN D IS IN COLUMN:')
      WRITE (6,343) SCOL
  343 FORMAT (1X, I2)
      WRITE (6,345)
  345 FORMAT (1X, 'AND ROW:')
      WRITE (6,343) SROW

C * CALCULATE WEIGHTED AVERAGE FOR THE NEW ROW OF F MATRIX
      DO 380 J = 1,Y
        N1 = F(SROW,(Y+1))
        N2 = F(SCOL,(Y+1))
        F(SROW,J) = ((N1*F(SROW,J)) + (N2*F(SCOL,J)))/(N1+N2)
  380 CONTINUE
C * STORE JOB DESIGNATORS AT END OF APPROPRIATE F MATRIX ROW
      S1 = INT(N1)
      S2 = INT(N2)
      Z = Y + 1
      P = S1 + 1
      DO 390 R = 1,S2
        F(SROW,(Z+P)) = F(SCOL,(Z+R))
        P = P + 1
  390 CONTINUE
C * INCREMENT N COUNTER IN THE COMBINED ROW OF F MATRIX
      F(SROW,Z) = N1 + N2

C * CLOSE REMAINING ROWS TOGETHER IN THE F MATRIX
      T = NUM - SCOL
      DO 400 I = 1,T
        DO 398 J = 1,(Y+14)
          F(SCOL,J) = F((SCOL+1),J)
  398   CONTINUE
        SCOL = SCOL + 1
  400 CONTINUE
      WRITE (6,410)
  410 FORMAT (/1X, 'F MATRIX')
      IF ((SOURCE .EQ. 'PROJA9') .OR. (SOURCE .EQ. 'MCGL')) THEN
        WRITE (6,412) ((F(I,J), J=1,22), I=1,(NUM-1))
  412   FORMAT (22(1X,F5.2))
      END IF
      IF (SOURCE .EQ. 'PROJA29') THEN
        WRITE (6,416) ((F(I,J), J=19,31), I=1,(NUM-1))
  416   FORMAT (13(1X,F5.2))
      END IF
C * SET VALUES FOR NEXT ITERATION
      NUM = NUM - 1
      NC = NC - NUM
  450 CONTINUE
C * WRITE OUT FINAL F MATRIX
  500 WRITE (6,501)
  501 FORMAT (/1X, 'FINAL F MATRIX AND JOB CLUSTERS')
      WRITE (6,510) ((F(I,J), J=1,22), I=1,NUMCLS)
  510 FORMAT (22(1X,F5.2))
  512 STOP
      END
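
The clustering loop of Appendix G can be condensed into a few functions. The following Python sketch is an illustrative re-expression, not the report's program: the function names and the toy validity rows are invented for the example. It assumes, as the listing does, that the column means are computed once from the unmerged matrix and held fixed, that the index HD is the size-weighted sum of squared deviations from those means, and that each step merges the pair of families whose A minus B value is smallest. Unlike LOCATE, the sketch only compares distinct pairs and never looks at diagonal entries.

```python
def sq_dev(row, col_means):
    """Sum of squared deviations of one validity row from the column means."""
    return sum((f - c) ** 2 for f, c in zip(row, col_means))


def horst_index(rows, sizes, col_means):
    """Size-weighted sum of squared deviations (the HD value in the listing)."""
    return sum(m * sq_dev(row, col_means) for row, m in zip(rows, sizes))


def merged_row(row_a, m_a, row_b, m_b):
    """Size-weighted average of two family rows."""
    return [(m_a * x + m_b * y) / (m_a + m_b) for x, y in zip(row_a, row_b)]


def merge_loss(row_a, m_a, row_b, m_b, col_means):
    """A(a,b) - B(a,b): the drop in HD if families a and b are merged."""
    a_term = m_a * sq_dev(row_a, col_means) + m_b * sq_dev(row_b, col_means)
    b_term = (m_a + m_b) * sq_dev(merged_row(row_a, m_a, row_b, m_b), col_means)
    return a_term - b_term


def cluster(validities, n_families):
    """Greedily merge the least costly pair until n_families remain."""
    rows = [list(r) for r in validities]
    sizes = [1] * len(rows)
    members = [[i] for i in range(len(rows))]
    n_cols = len(rows[0])
    # Column means come from the unmerged matrix and are never updated.
    col_means = [sum(r[j] for r in rows) / len(rows) for j in range(n_cols)]
    while len(rows) > n_families:
        best = None
        for a in range(len(rows)):
            for b in range(a + 1, len(rows)):
                loss = merge_loss(rows[a], sizes[a], rows[b], sizes[b], col_means)
                if best is None or loss < best[0]:
                    best = (loss, a, b)
        _, a, b = best
        rows[a] = merged_row(rows[a], sizes[a], rows[b], sizes[b])
        sizes[a] += sizes[b]
        members[a] += members[b]
        del rows[b], sizes[b], members[b]
    return members, horst_index(rows, sizes, col_means)


toy = [[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [1.0, 0.9]]  # made-up validities
families, hd = cluster(toy, 2)
print(families, round(hd, 4))
```

Because each merge replaces two rows by their weighted mean, the loss is never negative, so choosing the smallest A minus B value keeps HD as large as possible at every step.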
APPENDIX H: SELECTION-EFFICIENT CLUSTERING PROGRAM


C--------------------------------------------------------------C
C  PROGRAM:  SELECTION-EFFICIENT CLUSTERING PROGRAM            C
C  PURPOSE:  CLUSTER JOBS INTO 6, 9, OR 12 JOB FAMILIES        C
C            WHILE MAXIMIZING PREDICTIVE VALIDITY              C
C--------------------------------------------------------------C

C * FUNCTIONS AND SUBROUTINES APPEAR FIRST IN THE PROGRAM
C * SUBROUTINE THAT INCREASES THE ROW DESIGNATORS AS NEEDED (CHOOSE 3)
C * CALL CHECK3(R1,R2,R3,X)
C * (FOR STAGE 1)
      SUBROUTINE CHECK3(X,Y,Z,MAX)
      INTEGER X, Y, Z, MAX
      IF (X .LE. (MAX-2)) THEN
        IF (Y .EQ. (MAX-1)) THEN
          X = X+1
          Y = X+1
          Z = Y+1
        END IF
      END IF
      IF (Z .GT. MAX) THEN
        Y = Y+1
        Z = Y+1
      END IF
      RETURN
      END

C * SUBROUTINE THAT INCREASES THE ROW DESIGNATORS AS NEEDED (CHOOSE 2)
C * CALL CHECK2(R1,R2,X)
C * (FOR STAGE 1)
      SUBROUTINE CHECK2(X,Y,MAX)
      INTEGER X, Y, MAX
      IF (X .LE. (MAX-1)) THEN
        IF (Y .GT. MAX) THEN
          X = X+1
          Y = X+1
        END IF
      END IF
      RETURN
      END

C * FUNCTION LOWR2 RETURNS ROW NUMBER WITH LOWEST R2 VALUE
C * ONLY CONSIDERS ROWS ELIGIBLE TO BE SELECTED
C * (FOR STAGE 2)
      INTEGER FUNCTION LOWR2(M,X,Y)
      INTEGER X, Y, TEMPRW
      REAL M, TEMPRL, LOW
      DIMENSION M(12,45)
      LOW = 800.00
      TEMPRW = 0
      DO 20 I = 1,X
        IF (M(I,(Y+2)) .EQ. 0.0) THEN
          TEMPRL = M(I,Y)
          IF (TEMPRL .LT. LOW) THEN
            LOW = TEMPRL
            TEMPRW = I
          END IF
        END IF
   20 CONTINUE
      LOWR2 = TEMPRW
      RETURN
      END

C * SUBROUTINE HIGHR2 RETURNS ROW NUMBER WITH THE HIGHEST R2 VALUE
C * CALL HIGHR2(M1,PROW1,RN,Y+1)
C * (FOR STAGE 2)
      SUBROUTINE HIGHR2(M,FINAL,X,Y)
      INTEGER X, Y, TEMPRW, I, FINAL
      REAL M, TEMPRL, HIGH
      DIMENSION M(12,45)
      HIGH = -1.00
      TEMPRW = 1
      DO 40 I = 1,X
        TEMPRL = M(I,Y)
        IF (TEMPRL .GT. HIGH) THEN
          HIGH = TEMPRL
          TEMPRW = I
        END IF
   40 CONTINUE
      FINAL = TEMPRW
      WRITE (6,45) FINAL
   45 FORMAT (/1X, 'HIGHEST R2:', I2)
      RETURN
      END

C * SUBROUTINE CALC2 CREATES TEMPORARY MATRICES M1 AND M2 WHEN THERE
C * ARE ONLY 2 JOBS IN THE ROW TO BE COMBINED WITH ALL OTHER ROWS
C * CALL CALC2(V,CJOB,ROW,HOWMNY,NUMCOL,Y,RN,M1,M2)
C * (FOR STAGE 2)
      SUBROUTINE CALC2(M,X,Y,I,J,K,L,A1,A2)
      INTEGER I, J, K, L, A, N, X, Y
      REAL M, A1, A2, TEMP1, TEMP2
      DIMENSION M(18,29), A1(12,45), A2(12,45), X(18), Y(18)
      DO 60 N = 1,K
        IF (N .LE. I) THEN
          A1(L,(N+K+3)) = Y(N)
          A2(L,(N+K+3)) = Y(N)
        END IF
        TEMP1 = 0
        TEMP2 = 0
        DO 55 A = 1,I
          TEMP1 = TEMP1 + M(Y(A),N)
          TEMP2 = TEMP2 + M(Y(A),N)
   55   CONTINUE
        A1(L,N) = (TEMP1 + M(X(1),N))/(I+1)
        A2(L,N) = (TEMP2 + M(X(2),N))/(I+1)
   60 CONTINUE
      A1(L,(N+1)) = I + 1
      A1(L,(N+2)) = 1
      A1(L,(N+A+2)) = X(1)
      A2(L,(N+1)) = I + 1
      A2(L,(N+2)) = 1
      A2(L,(N+A+2)) = X(2)
      RETURN
      END

C * SUBROUTINE CALC3 CREATES TEMPORARY MATRICES M1, M2, M3 WHEN THERE
C * ARE 3 JOBS IN THE ROW TO BE COMBINED WITH ALL OTHER ROWS
C * CALL CALC3(V,CJOB,ROW,HOWMNY,NUMCOL,Y,RN,M1,M2,M3)
C * (FOR STAGE 2)
      SUBROUTINE CALC3(M,X,Y,I,J,K,L,A1,A2,A3)
      INTEGER I, J, K, L, A, N, X, Y
      REAL M, A1, A2, A3, TEMP1, TEMP2, TEMP3
      DIMENSION M(18,29), A1(12,45), A2(12,45), A3(12,45)
      DIMENSION X(18), Y(18)
      DO 80 N = 1,K
        IF (N .LE. I) THEN
          A1(L,(N+K+3)) = Y(N)
          A2(L,(N+K+3)) = Y(N)
          A3(L,(N+K+3)) = Y(N)
        END IF
        TEMP1 = 0
        TEMP2 = 0
        TEMP3 = 0
        DO 75 A = 1,I
          TEMP1 = TEMP1 + M(Y(A),N)
          TEMP2 = TEMP2 + M(Y(A),N)
          TEMP3 = TEMP3 + M(Y(A),N)
   75   CONTINUE
        A1(L,N) = (TEMP1 + M(X(1),N))/(I+1)
        A2(L,N) = (TEMP2 + M(X(2),N))/(I+1)
        A3(L,N) = (TEMP3 + M(X(3),N))/(I+1)
   80 CONTINUE
      A1(L,(N+1)) = I + 1
      A1(L,(N+2)) = 1
      A1(L,(N+A+2)) = X(1)
      A2(L,(N+1)) = I + 1
      A2(L,(N+2)) = 1
      A2(L,(N+A+2)) = X(2)
      A3(L,(N+1)) = I + 1
      A3(L,(N+2)) = 1
      A3(L,(N+A+2)) = X(3)
      RETURN
      END

C * SUBROUTINE AVGWR2 CALCULATES THE AVERAGE WEIGHTED R2 VALUE
C * CALL AVGWR2(TEMP1,NEWVL1,Y+1,Y+2,NUMCLS)
      SUBROUTINE AVGWR2(M,VALUE,X1,Y1,Z)
      INTEGER X1, Y1, Z, I
      REAL M, TEMP, VALUE
      DIMENSION M(12,45)
      TEMP = 0
      DO 90 I = 1,Z
        TEMP = TEMP + (M(I,X1) * M(I,Y1))
   90 CONTINUE
      VALUE = TEMP/18
      RETURN
      END
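
As a reading aid, the weighting done by AVGWR2 can be written in a few lines of Python. This is a sketch under stated assumptions, not the report's code: the function name and the sample values are invented, and it assumes (from the CALL comment and from where the other subroutines store their results) that the two columns being multiplied hold each family's R2 and its number of jobs. The divisor 18 is the total number of jobs, hard-coded in the listing.

```python
def average_weighted_r2(families, total_jobs=18):
    """Job-weighted mean of family R2 values.

    families: list of (r2, n_jobs) pairs, one per job family.
    total_jobs: fixed divisor, 18 in the listing.
    """
    return sum(r2 * n_jobs for r2, n_jobs in families) / total_jobs


# Made-up example: one family of 12 jobs and one of 6 jobs.
example = [(0.5, 12), (0.2, 6)]
print(round(average_weighted_r2(example), 4))
```

Weighting by family size means that a merge which lowers R2 for a large family costs more than the same drop in a small one.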

C * SUBROUTINE COPYMX COPIES AN ENTIRE MATRIX ONTO ANOTHER MATRIX
C * CALL COPYMX(NEWRES,TEMP1,Y,NUMCLS) OR
C * CALL COPYMX(TEMP1,NEWRES,Y,NUMCLS)
C * (FOR STAGE 2)
      SUBROUTINE COPYMX(A,B,C,D)
      INTEGER I, J, C, D
      REAL A, B
      DIMENSION A(12,45), B(12,45)
      DO 100 I = 1,D
        DO 99 J = 1,(C+16)
          B(I,J) = A(I,J)
   99   CONTINUE
  100 CONTINUE
      RETURN
      END

C * SUBROUTINE CREAT2 WILL CREATE 2 TEMPORARY MATRICES WITH THE ROW
C * HAVING THE HIGHEST R2 FROM M1 AND M2, RESPECTIVELY, SUBSTITUTED
C * INTO TEMP1 AND TEMP2--IT WILL THEN BE POSSIBLE TO CALCULATE THE
C * AVERAGE WEIGHTED R2 TO DETERMINE IF ONE OF THESE SHOULD REPLACE
C * NEWRES
C * CALL CREAT2(V,RINV,M1,M2,PROW1,PROW2,CJOB,Y,NUMCOL,NUMCLS,Y+16,
C *             CROW,TEMP1,TEMP2)
C * (FOR STAGE 2)
      SUBROUTINE CREAT2(A,R,B1,B2,X1,X2,Y1,K,L,M,N,P,TP1,TP2)
      INTEGER I, J, K, L, M, N, P, X1, X2, Y1
      REAL A, R, B, C, TP1, TP2, SUM1, SUM2, Q1, Q2
      REAL HOLD1, HOLD2, RSQ1, RSQ2
      DIMENSION A(18,29), R(29,29), B1(12,45), B2(12,45), Y1(18)
      DIMENSION TP1(12,45), TP2(12,45), HOLD1(29), HOLD2(29)
      DO 140 I = 1,M
        IF (TP1(I,(K+4)) .EQ. B1(X1,(K+4))) THEN
          DO 120 J = 1,N
            TP1(I,J) = B1(X1,J)
  120     CONTINUE
        END IF
        IF (TP2(I,(K+4)) .EQ. B2(X2,(K+4))) THEN
          DO 130 J = 1,N
            TP2(I,J) = B2(X2,J)
  130     CONTINUE
        END IF
  140 CONTINUE
C * FIX THE TEMPORARY MATRICES SO THAT THE CROW (CHOSEN ROW) WHICH HAS
C * ONLY ONE JOB LEFT HAS THE CORRECT V VECTOR, RECALCULATE R2, AND
C * SET COUNTER CORRECTLY
      DO 150 J = 1,K
        TP1(P,J) = (A(Y1(2),J))/(L-1)
        TP2(P,J) = (A(Y1(1),J))/(L-1)
  150 CONTINUE
      DO 160 I = 1,K
        SUM1 = 0
        SUM2 = 0
        DO 155 J = 1,K
          Q1 = TP1(P,J) * R(J,I)
          SUM1 = SUM1 + Q1
          Q2 = TP2(P,J) * R(J,I)
          SUM2 = SUM2 + Q2
  155   CONTINUE
        HOLD1(I) = SUM1
        HOLD2(I) = SUM2
  160 CONTINUE
      RSQ1 = 0
      RSQ2 = 0
      DO 170 J = 1,K
        Q1 = HOLD1(J) * TP1(P,J)
        RSQ1 = RSQ1 + Q1
        Q2 = HOLD2(J) * TP2(P,J)
        RSQ2 = RSQ2 + Q2
  170 CONTINUE
C * STORE R2 VALUE, COUNTER VALUE, AND JOB DESIGNATORS
      TP1(P,J) = RSQ1
      TP2(P,J) = RSQ2
      TP1(P,(J+1)) = L-1
      TP2(P,(J+1)) = L-1
      TP1(P,(J+3)) = Y1(2)
      TP1(P,(J+4)) = 0.0
      TP1(P,(J+5)) = 0.0
      TP2(P,(J+3)) = Y1(1)
      TP2(P,(J+4)) = 0.0
      TP2(P,(J+5)) = 0.0
      RETURN
      END
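
The two nested loops that fill HOLD and then RSQ in CREAT2 evaluate a quadratic form: the validity vector times the matrix passed as R (RINV in the CALL comment, the inverse of the predictor intercorrelation matrix) times the same vector again. The Python sketch below is illustrative only. The function names are hypothetical, the two-predictor inverse is written out by hand so the example needs no linear algebra library, and the numbers are made up.

```python
def squared_multiple_r(validity, r_inv):
    """Quadratic form v * R^-1 * v', computed with the same two passes
    as the listing: first hold = v * R^-1, then the dot product hold . v."""
    k = len(validity)
    hold = [sum(validity[j] * r_inv[j][i] for j in range(k)) for i in range(k)]
    return sum(hold[j] * validity[j] for j in range(k))


def invert_2x2_corr(r12):
    """Inverse of the correlation matrix [[1, r12], [r12, 1]]."""
    det = 1.0 - r12 * r12
    return [[1.0 / det, -r12 / det], [-r12 / det, 1.0 / det]]


r_inv = invert_2x2_corr(0.5)                  # made-up intercorrelation
r2 = squared_multiple_r([0.6, 0.4], r_inv)    # made-up validities
print(round(r2, 4))
```

With uncorrelated predictors the inverse is the identity matrix and the result reduces to the sum of the squared validities, which gives a quick sanity check on any implementation.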

C * SUBROUTINE CREAT3 WILL CREATE 3 TEMPORARY MATRICES WITH THE ROW
C * HAVING THE HIGHEST R2 FROM M1, M2, M3, RESPECTIVELY, SUBSTITUTED
C * INTO TEMP1,TEMP2,TEMP3. IT WILL THEN BE POSSIBLE TO CALCULATE THE
C * AVERAGE WEIGHTED R2 TO DETERMINE IF ONE OF THESE SHOULD REPLACE
C * NEWRES
C * CALL CREAT3(V,RINV,M1,M2,M3,PROW1,PROW2,PROW3,CJOB,Y,NUMCOL,
C *             NUMCLS,Y+16,CROW,TEMP1,TEMP2,TEMP3)
C * (FOR STAGE 2)
      SUBROUTINE CREAT3(A,R,B1,B2,B3,X1,X2,X3,Y1,K,L,M,N,P,TP1,TP2,TP3)
      INTEGER I, J, K, L, M, N, P, X1, X2, X3, Y1
      REAL A, R, B, C, TP1, TP2, TP3, SUM1, SUM2, SUM3, Q1, Q2, Q3
      REAL HOLD1, HOLD2, HOLD3, RSQ1, RSQ2, RSQ3
      DIMENSION A(18,29), R(29,29), B1(12,45), B2(12,45), B3(12,45)
      DIMENSION Y1(18), TP1(12,45), TP2(12,45), TP3(12,45)
      DIMENSION HOLD1(29), HOLD2(29), HOLD3(29)
      DO 250 I = 1,M
        IF (TP1(I,(K+4)) .EQ. B1(X1,(K+4))) THEN
          DO 220 J = 1,N
            TP1(I,J) = B1(X1,J)
  220     CONTINUE
        END IF
        IF (TP2(I,(K+4)) .EQ. B2(X2,(K+4))) THEN
          DO 230 J = 1,N
            TP2(I,J) = B2(X2,J)
  230     CONTINUE
        END IF
        IF (TP3(I,(K+4)) .EQ. B3(X3,(K+4))) THEN
          DO 240 J = 1,N
            TP3(I,J) = B3(X3,J)
  240     CONTINUE
        END IF
  250 CONTINUE
C * FIX THE TEMPORARY MATRICES SO THAT THE CROW (CHOSEN ROW) WHICH HAS
C * TWO JOBS LEFT HAS THE CORRECT V VECTOR, RECALCULATE R2, AND
C * SET COUNTER CORRECTLY
      DO 252 J = 1,K
        TP1(P,J) = (A(Y1(2),J) + A(Y1(3),J))/(L-1)
        TP2(P,J) = (A(Y1(1),J) + A(Y1(3),J))/(L-1)
        TP3(P,J) = (A(Y1(1),J) + A(Y1(2),J))/(L-1)
  252 CONTINUE
      DO 260 I = 1,K
        SUM1 = 0
        SUM2 = 0
        SUM3 = 0
        DO 255 J = 1,K
          Q1 = TP1(P,J) * R(J,I)
          SUM1 = SUM1 + Q1
          Q2 = TP2(P,J) * R(J,I)
          SUM2 = SUM2 + Q2
          Q3 = TP3(P,J) * R(J,I)
          SUM3 = SUM3 + Q3
  255   CONTINUE
        HOLD1(I) = SUM1
        HOLD2(I) = SUM2
        HOLD3(I) = SUM3
  260 CONTINUE
      RSQ1 = 0
      RSQ2 = 0
      RSQ3 = 0
      DO 270 J = 1,K
        Q1 = HOLD1(J) * TP1(P,J)
        RSQ1 = RSQ1 + Q1
        Q2 = HOLD2(J) * TP2(P,J)
        RSQ2 = RSQ2 + Q2
        Q3 = HOLD3(J) * TP2(P,J)
        RSQ3 = RSQ3 + Q3
  270 CONTINUE
C * STORE R2 VALUE, COUNTER VALUE, AND JOB DESIGNATORS
      TP1(P,J) = RSQ1
      TP2(P,J) = RSQ2
      TP3(P,J) = RSQ3
      TP1(P,(J+1)) = L-1
      TP2(P,(J+1)) = L-1
      TP3(P,(J+1)) = L-1
      TP1(P,(J+3)) = Y1(2)
      TP1(P,(J+4)) = Y1(3)
      TP1(P,(J+5)) = 0.0
      TP2(P,(J+3)) = Y1(1)
      TP2(P,(J+4)) = Y1(3)
      TP2(P,(J+5)) = 0.0
      TP3(P,(J+3)) = Y1(1)
      TP3(P,(J+4)) = Y1(2)
      TP3(P,(J+5)) = 0.0
      RETURN
      END


C  MAIN PROGRAM (STAGE 1)
C
C  STAGE 1 WILL AVERAGE ALL POSSIBLE COMBINATIONS OF 3 OR
C  2 ROWS DEPENDING ON THE CONDITION, CALCULATE R2 FOR
C  EACH ROW, PICK LARGEST R2, AND END UP WITH EITHER
C  6, 9, OR 12 ROWS (ALL WITH DIFFERENT JOBS) WITH THE
C  HIGHEST R2.
C
C * DECLARE VARIABLES (BOTH STAGES)
      REAL RINV, V, RES, TRES, TOTRV, MULT, TOTAL, P1, P2, P3
      REAL HRES, NEWRES, M1, M2, M3, TOTRV1, TOTRV2, TOTRV3
      REAL TM1, TM2, TM3, TOTAL1, TOTAL2, TOTAL3, MULT1, MULT2, MULT3
      REAL TEMP1, TEMP2, TEMP3, ORGVAL, NEWVL1, NEWVL2, NEWVL3
      DIMENSION RINV(29,29), V(18,29), RES(816,33), TRES(29,816)
      DIMENSION TOTRV(816,29), HRES(12,33), NEWRES(12,45)
      DIMENSION M1(12,45), M2(12,45), M3(12,45)
      DIMENSION TEMP1(12,45), TEMP2(12,45), TEMP3(12,45)
      DIMENSION CJOB(18), ROW(18)
      DIMENSION TOTRV1(12,29), TOTRV2(12,29), TOTRV3(12,29)
      DIMENSION TM1(29,12), TM2(29,12), TM3(29,12)
      INTEGER NUMCLS, I, J, X, Y, R1, R2, R3, NC, VAL, P
      INTEGER LOCLRG, K, L, CROW, NUMCOL, HOWMNY, COUNT, RN, G
      INTEGER CJOB, ROW, PROW1, PROW2, PROW3
      CHARACTER*8 SOURCE
C * VARIABLES THAT WILL NEED TO BE CHANGED FOR EACH
C * DIFFERENT CONDITION
C * NUMBER OF CLUSTERS (06, 09, OR 12)
      NUMCLS = 12
C * DATA SOURCE (PROJA29, PROJA9, OR MCGL)
      SOURCE = 'PROJA29'
C * NUMBER OF ROWS IN VALIDITY MATRIX
      X = 18
C * NUMBER OF COLUMNS IN VALIDITY MATRIX
      Y = 29
C * CHOOSE VALUE (03 OR 02)
      VAL = 2
C * VALUE AFTER CALCULATION OF N CHOOSE M VALUE (816 OR 153)
      NC = 153

C * READ IN DATA FILES
      IF (SOURCE .EQ. 'PROJA29') THEN
        READ (5,36) ((RINV(I,J), J=1,29), I=1,29)
        READ (5,36) ((V(I,J), J=1,29), I=1,18)
   36   FORMAT (29(1X,F9.6))
      END IF
      IF (SOURCE .EQ. 'PROJA9') THEN
        READ (5,40) ((RINV(I,J), J=1,9), I=1,9)
        READ (5,40) ((V(I,J), J=1,9), I=1,18)
   40   FORMAT (8(1X,F9.6)/F9.6)
      END IF
      IF (SOURCE .EQ. 'MCGL') THEN
        READ (5,40) ((RINV(I,J), J=1,9), I=1,9)
        READ (5,40) ((V(I,J), J=1,9), I=1,18)
        WRITE (6,50) ((RINV(I,J), J=1,9), I=1,9)
        WRITE (6,50) ((V(I,J), J=1,9), I=1,18)
   50   FORMAT (9(1X,F9.6))
      END IF
C * ZERO OUT RESULTS MATRICES
      DO 120 I = 1,NC
        DO 119 J = 1,(Y+4)
          RES(I,J) = 0
          TOTRV(I,J) = 0
          TRES(J,I) = 0
  119   CONTINUE
  120 CONTINUE
      IF (VAL .EQ. 3) GO TO 130
      IF (VAL .EQ. 2) GO TO 150

C * CALCULATE THE RESULTS MATRIX FOR CHOOSE 3 VALUE
C * ROW DESIGNATORS PLACED IN THE LAST THREE COLUMNS
C * OF RESULTS MATRIX
  130 R1 = 1
      R2 = R1 + 1
      R3 = R2 + 1
      DO 140 I = 1,NC
        CALL CHECK3(R1,R2,R3,X)
        DO 135 J = 1,Y
          RES(I,J) = (V(R1,J) + V(R2,J) + V(R3,J))/3.0
  135   CONTINUE
        RES(I,J) = R1
        RES(I,J+1) = R2
        RES(I,J+2) = R3
        R3 = R3 + 1
  140 CONTINUE
C * PRODUCE THE OUTPUT TO CHECK PROGRAM
C     WRITE (6,145) ((RES(I,J), J=1,(Y+VAL)), I=1,NC)
C 145 FORMAT (12(1X,F6.3))
      GO TO 175
C * CALCULATE THE RESULTS MATRIX FOR CHOOSE 2 VALUE
C * ROW DESIGNATORS PLACED IN THE LAST TWO COLUMNS OF RESULTS MATRIX
  150 R1 = 1
      R2 = R1 + 1
      DO 170 I = 1,NC
        CALL CHECK2(R1,R2,X)
        DO 155 J = 1,Y
          RES(I,J) = (V(R1,J) + V(R2,J))/2.0
  155   CONTINUE
        RES(I,J) = R1
        RES(I,J+1) = R2
        R2 = R2 + 1
  170 CONTINUE
C * PRODUCE THE OUTPUT TO CHECK PROGRAM
C     WRITE (6,172) ((RES(I,J), J=1,(Y+VAL)), I=1,NC)
C 172 FORMAT (11(1X,F6.3))

C * FOR THE 12 JOB CLUSTERS CONDITION THE ORIGINAL 18 BY Y VALIDITY
C * MATRIX MUST BE ADDED TO THE BOTTOM OF THE RESULTS MATRIX
  175 IF (NUMCLS .EQ. 12) THEN
        P = 1
        DO 190 I = (NC+1), (NC+X)
          DO 185 J = 1,Y
            RES(I,J) = V(P,J)
  185     CONTINUE
          RES(I,J) = P
          RES(I,J+1) = P
          P = P+1
  190   CONTINUE
        NC = NC + 18
      END IF
C * TRANSPOSE RESULTS MATRIX (I.E., TRANSPOSE AVERAGE VALIDITY MATRIX)
      DO 200 I = 1,NC
        DO 199 J = 1,Y
          TRES(J,I) = RES(I,J)
  199   CONTINUE
  200 CONTINUE
C
C * THE NEXT SECTION CALCULATES V * R(INVERSE) * V(TRANSPOSED)
C * AND OUTPUT R2 AS ANOTHER COLUMN IN THE RESULTS MATRIX
      DO 220 I = 1,NC
        DO 219 P = 1,Y
          TOTAL = 0
          DO 217 J = 1,Y
            MULT = RES(I,J) * RINV(J,P)
            TOTAL = TOTAL + MULT
  217     CONTINUE
          TOTRV(I,P) = TOTAL
  219   CONTINUE
  220 CONTINUE
      DO 230 I = 1,NC
        TOTAL = 0
        DO 228 J = 1,Y
          MULT = TOTRV(I,J) * TRES(J,I)
          TOTAL = TOTAL + MULT
  228   CONTINUE
        RES(I,(Y+VAL+1)) = TOTAL
  230 CONTINUE
C * PRODUCE THE OUTPUT TO CHECK PROGRAM
C     WRITE (6,235) ((RES(I,J), J=1,(Y+VAL+1)), I=1,NC)
C 235 FORMAT (33(1X,F6.3))

C 

C * THE NEXT SECTION WILL LOCATE THE LARGEST R2S, AND STORE THE DATA
C * IN THESE ROWS CORRESPONDING TO THE LARGEST R2
C * ALL OTHER R2 VALUES ARE SET TO ZERO WHENEVER THE SAME JOB #'S
C * HAVE ALREADY BEEN SELECTED
      DO 400 K = 1,NUMCLS
        LOCLRG = 1
        J = Y + VAL + 1
C * FOR 12 CONDITION MUST DO SOMETHING DIFFERENT WHEN K > 6
        IF (NUMCLS .EQ. 12) THEN
          IF (K .GT. 6) GO TO 405
          NC = NC-18
        END IF
C * LOCATE LARGEST R2 VALUE
        DO 260 I = 2,NC
          IF (RES(I,J) .GT. RES(LOCLRG,J)) THEN
            LOCLRG = I
          END IF
  260   CONTINUE
C * CREATE TEMPORARY HRES MATRIX TO STORE THE EVOLVING CLUSTERS
        DO 270 L = 1,J
          HRES(K,L) = RES(LOCLRG,L)
  270   CONTINUE

C * SET R2 VALUES TO ZERO THAT HAVE THE SAME JOB DESIGNATORS
        IF (VAL .EQ. 3) GO TO 300
        IF (VAL .EQ. 2) GO TO 350
  300   P1 = RES(LOCLRG,(J-3))
        P2 = RES(LOCLRG,(J-2))
        P3 = RES(LOCLRG,(J-1))
        DO 330 I = 1,NC
          IF ((P1 .EQ. RES(I,(J-3))) .OR. (P1 .EQ. RES(I,(J-2))) .OR.
     +        (P1 .EQ. RES(I,(J-1)))) THEN
            RES(I,J) = 0
          END IF
          IF ((P2 .EQ. RES(I,(J-3))) .OR. (P2 .EQ. RES(I,(J-2))) .OR.
     +        (P2 .EQ. RES(I,(J-1)))) THEN
            RES(I,J) = 0
          END IF
          IF ((P3 .EQ. RES(I,(J-3))) .OR. (P3 .EQ. RES(I,(J-2))) .OR.
     +        (P3 .EQ. RES(I,(J-1)))) THEN
            RES(I,J) = 0
          END IF
  330   CONTINUE
        GO TO 400
  350   P1 = RES(LOCLRG,(J-2))
        P2 = RES(LOCLRG,(J-1))
  356   IF (NUMCLS .EQ. 12) THEN
          NC = NC+18
        END IF
        DO 380 I = 1,NC
          IF ((P1 .EQ. RES(I,(J-2))) .OR. (P1 .EQ. RES(I,(J-1)))) THEN
            RES(I,J) = 0
          END IF
          IF ((P2 .EQ. RES(I,(J-2))) .OR. (P2 .EQ. RES(I,(J-1)))) THEN
            RES(I,J) = 0
          END IF
  380   CONTINUE
  400 CONTINUE

C * FOR THE 12 CLUSTER CONDITION THE HIGHEST R2 FOR 6 SINGLE JOBS
C * MUST BE IDENTIFIED AND ORDERED
  405 IF (NUMCLS .EQ. 12) THEN
        DO 420 K = 7,12
          LOCLRG = NC-17
          J = Y+VAL+1
          DO 410 I = (NC-16),NC
            IF (RES(I,J) .GT. RES(LOCLRG,J)) THEN
              LOCLRG = I
            END IF
  410     CONTINUE
C * FOR THE 12 CLUSTER CONDITION FINISH THE LAST 6 ROWS OF HRES
C * AND SET R2 VALUE TO ZERO FOR THAT ROW
          DO 415 L = 1,J
            HRES(K,L) = RES(LOCLRG,L)
  415     CONTINUE
          RES(LOCLRG,J) = 0
  420   CONTINUE
      END IF

C * WRITE OUT INITIAL CLUSTERS (STORED IN HRES)
C     WRITE (6,422)
C 422 FORMAT (/1X, 'INITIAL CLUSTERS')
C     IF (VAL .EQ. 3) THEN
C       IF (SOURCE .EQ. 'PROJA29') THEN
C         WRITE (6,424) ((HRES(I,J), J=1,33), I=1,NUMCLS)
C 424     FORMAT (33(1X,F5.2))
C         GO TO 440
C       END IF
C       WRITE (6,426) ((HRES(I,J), J=1,13), I=1,NUMCLS)
C 426   FORMAT (13(1X,F6.3))
C     END IF
C     IF (VAL .EQ. 2) THEN
C       IF (SOURCE .EQ. 'PROJA29') THEN
C         WRITE (6,428) ((HRES(I,J), J=1,32), I=1,NUMCLS)
C 428     FORMAT (32(1X,F5.2))
C         GO TO 440
C       END IF
C       WRITE (6,430) ((HRES(I,J), J=1,12), I=1,NUMCLS)
C 430   FORMAT (12(1X,F6.3))
C     END IF


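C............................................................C
C STAGE 2                                                    C
C STAGE 2 WILL SHRED OUT THE INITIAL SET OF CLUSTERS TO      C
C DETERMINE IF THERE IS A MORE OPTIMAL COMBINATION OF JOBS   C
C THAN THE INITIAL CORE CLUSTERS.                            C
C............................................................C
C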
C * CREATE MATRIX NEWRES THAT IS THE SAME AS THE HRES MATRIX
C * BUT WITH THE COLUMNS AT END OF MATRIX IN DIFFERENT PLACES.
C * THIS GIVES UNLIMITED SPACE TO STORE THE JOB DESIGNATORS.
C * R2 VALUE STORED IN COLUMN 10 (OR 30)
C * A COUNTER VALUE STORED IN COLUMN 11 (OR 31)
C * A "SELECTED" DESIGNATOR STORED IN COLUMN 12 (OR 32)
C * JOB DESIGNATORS ARE STORED IN COLUMNS 13 (OR 33) ON.
C * STORE DATA VALUES FIRST
  440 DO 450 I = 1,NUMCLS
        DO 449 J = 1,Y
          NEWRES(I,J) = HRES(I,J)
  449   CONTINUE
  450 CONTINUE

C * STORE R2 AND INITIALIZE "SELECTED" VALUE TO ZERO
      DO 455 I = 1,NUMCLS
        NEWRES(I,(Y+1)) = HRES(I,(Y+VAL+1))
        NEWRES(I,(Y+3)) = 0
  455 CONTINUE
C * STORE JOB DESIGNATORS AND CREATE COUNTER
      IF (VAL .EQ. 3) THEN
        DO 460 I = 1,NUMCLS
          NEWRES(I,(Y+4)) = HRES(I,(Y+1))
          NEWRES(I,(Y+5)) = HRES(I,(Y+2))
          NEWRES(I,(Y+6)) = HRES(I,(Y+3))
          NEWRES(I,(Y+2)) = 3.0
  460   CONTINUE
      END IF
      IF (VAL .EQ. 2) THEN
        DO 470 I = 1,NUMCLS
          NEWRES(I,(Y+4)) = HRES(I,(Y+1))
          NEWRES(I,(Y+5)) = HRES(I,(Y+2))
          NEWRES(I,(Y+2)) = 2.0
  470   CONTINUE
      END IF
      IF (NUMCLS .EQ. 12) THEN
        DO 475 I = 7,12
          NEWRES(I,(Y+2)) = 1.0
          NEWRES(I,(Y+3)) = 1.0
  475   CONTINUE
      END IF
      WRITE (6,483)
  483 FORMAT (/1X, 'NEWRES MATRIX')
      WRITE (6,485) ((NEWRES(I,J), J=(Y+1),(Y+VAL+3)), I=1,NUMCLS)
  485 FORMAT (5(1X,F6.3))
C
C * CALCULATE THE INITIAL AVERAGE WEIGHTED R2 VALUE FOR NEWRES
      CALL AVGWR2(NEWRES,ORGVAL,Y+1,Y+2,NUMCLS)
      WRITE (6,490) ORGVAL
  490 FORMAT (/1X, 'INITIAL R2 VALUE FOR NEWRES:', 1X, F8.6)
C
C * INITIALIZE THE ITERATION COUNTER
      G = 0
C

C * INITIALIZE M1, M2, M3, TEMP1, TEMP2, TEMP3
  500 DO 505 I = 1,NUMCLS
        DO 504 J = 1,(Y+16)
          M1(I,J) = 0
          M2(I,J) = 0
          M3(I,J) = 0
          TEMP1(I,J) = 0
          TEMP2(I,J) = 0
          TEMP3(I,J) = 0
  504   CONTINUE
  505 CONTINUE
C * INITIALIZE TOTRV1, TOTRV2, TOTRV3 TO BE USED IN CALC OF R2
      DO 510 I = 1,NUMCLS
        DO 509 J = 1,Y
          TOTRV1(I,J) = 0
          TOTRV2(I,J) = 0
          TOTRV3(I,J) = 0
  509   CONTINUE
  510 CONTINUE
C
C * CHOOSE THE ROW FROM NEWRES TO BE COMBINED (ONE JOB AT A TIME)
C * WITH THE OTHER ROWS
      CROW = LOWR2(NEWRES,NUMCLS,(Y+1))
C
C * THE NEXT STATEMENT PROVIDES THE LOOP TO END THE PROGRAM IF ALL
C * ROWS THAT ARE ELIGIBLE TO BE SELECTED HAVE BEEN SELECTED
      IF (CROW .GT. 0) THEN
        G = G + 1
C
C * SET THE "SELECTED" VALUE TO 1 FOR THE CROW JUST CHOSEN
        NEWRES(CROW,(Y+3)) = 1.0
C * OBTAIN THE COUNTER VALUE FOR THIS ROW
        NUMCOL = INT(NEWRES(CROW,(Y+2)))
C * OBTAIN THE JOB DESIGNATORS FOR THIS ROW
        DO 530 I = 1,NUMCOL
          CJOB(I) = INT(NEWRES(CROW,(Y+3+I)))
  530   CONTINUE
C

C * CREATE THE TEMPORARY MATRICES (M1,M2,M3)
        RN = 0
        DO 550 I = 1,NUMCLS
          IF (I .NE. CROW) THEN
            RN = RN + 1
            HOWMNY = INT(NEWRES(I,(Y+2)))
            DO 540 K = 1,HOWMNY
              ROW(K) = INT(NEWRES(I,(Y+3+K)))
  540       CONTINUE
            IF (NUMCOL .EQ. 2) THEN
              CALL CALC2(V,CJOB,ROW,HOWMNY,NUMCOL,Y,RN,M1,M2)
            END IF
            IF (NUMCOL .EQ. 3) THEN
              CALL CALC3(V,CJOB,ROW,HOWMNY,NUMCOL,Y,RN,M1,M2,M3)
            END IF
          END IF
  550   CONTINUE
C

C * TRANSPOSE M1, M2, AND M3 TO BE USED IN THE CALC OF R2
        DO 570 I = 1,(NUMCLS-1)
          DO 569 J = 1,Y
            TM1(J,I) = M1(I,J)
            TM2(J,I) = M2(I,J)
            IF (NUMCOL .EQ. 2) GO TO 569
            TM3(J,I) = M3(I,J)
  569     CONTINUE
  570   CONTINUE

C * CALCULATE R2 VALUE FOR M1, M2, M3 (ONLY CALCULATES R2 FOR M3 IF
C * NUMCOL EQUALS 3)
        DO 650 I = 1,(NUMCLS-1)
          DO 645 P = 1,Y
            TOTAL1 = 0
            TOTAL2 = 0
            TOTAL3 = 0
            DO 600 J = 1,Y
              MULT1 = M1(I,J) * RINV(J,P)
              TOTAL1 = TOTAL1 + MULT1
  600       CONTINUE
            TOTRV1(I,P) = TOTAL1
            DO 620 J = 1,Y
              MULT2 = M2(I,J) * RINV(J,P)
              TOTAL2 = TOTAL2 + MULT2
  620       CONTINUE
            TOTRV2(I,P) = TOTAL2
            IF (NUMCOL .EQ. 2) GO TO 645
            DO 630 J = 1,Y
              MULT3 = M3(I,J) * RINV(J,P)
              TOTAL3 = TOTAL3 + MULT3
  630       CONTINUE
            TOTRV3(I,P) = TOTAL3
  645     CONTINUE
  650   CONTINUE
        DO 730 I = 1,(NUMCLS-1)
          TOTAL1 = 0
          TOTAL2 = 0
          TOTAL3 = 0
          DO 710 J = 1,Y
            MULT1 = TOTRV1(I,J) * TM1(J,I)
            TOTAL1 = TOTAL1 + MULT1
            MULT2 = TOTRV2(I,J) * TM2(J,I)
            TOTAL2 = TOTAL2 + MULT2
            IF (NUMCOL .EQ. 2) GO TO 710
            MULT3 = TOTRV3(I,J) * TM3(J,I)
            TOTAL3 = TOTAL3 + MULT3
  710     CONTINUE
          M1(I,(Y+1)) = TOTAL1
          M2(I,(Y+1)) = TOTAL2
          IF (NUMCOL .EQ. 2) GO TO 730
          M3(I,(Y+1)) = TOTAL3
  730   CONTINUE

        WRITE (6,735) G
  735   FORMAT (/1X, 'ITERATION NUMBER:', 1X, I2)
C * WRITE OUT COUNTERS AND JOB NUMBERS FOR M1, M2, AND M3
C * TO CHECK THE PROGRAM
        WRITE (6,770)
  770   FORMAT (/1X, 'M1')
        WRITE (6,772) ((M1(I,J), J=(Y+1),(Y+16)), I=1,(NUMCLS-1))
  772   FORMAT (16(1X,F6.3))
        WRITE (6,774)
  774   FORMAT (/1X, 'M2')
        WRITE (6,772) ((M2(I,J), J=(Y+1),(Y+16)), I=1,(NUMCLS-1))
        IF (NUMCOL .EQ. 2) GO TO 800
        WRITE (6,776)
  776   FORMAT (/1X, 'M3')
        WRITE (6,772) ((M3(I,J), J=(Y+1),(Y+16)), I=1,(NUMCLS-1))
C

C * THE NEXT SECTION PULLS OUT THE HIGHEST R2 FROM M1, M2, M3,
C * SUBSTITUTES THE APPROPRIATE ROWS TO CREATE A NEW "TEMP"
C * MATRIX, CALCULATES THE AVERAGE WEIGHTED R2 FOR EACH TEMP MATRIX AND
C * PICKS THE HIGHEST WEIGHTED AVERAGE COMBINATION AS THE NEW "NEWRES"
  800   IF (NUMCOL .EQ. 2) THEN
          CALL HIGHR2(M1,PROW1,RN,Y+1)
          CALL HIGHR2(M2,PROW2,RN,Y+1)
          CALL COPYMX(NEWRES,TEMP1,Y,NUMCLS)
          CALL COPYMX(NEWRES,TEMP2,Y,NUMCLS)
          CALL CREAT2(V,RINV,M1,M2,PROW1,PROW2,CJOB,Y,NUMCOL,NUMCLS,
     +                Y+16,CROW,TEMP1,TEMP2)
          CALL AVGWR2(TEMP1,NEWVL1,Y+1,Y+2,NUMCLS)
          CALL AVGWR2(TEMP2,NEWVL2,Y+1,Y+2,NUMCLS)
C * WRITE OUT TEMP MATRICES AND WEIGHTED R2 VALUES TO CHECK PROGRAM
          WRITE (6,820)
  820     FORMAT (/1X, 'TEMPORARY MATRIX 1')
          WRITE (6,822) ((TEMP1(I,J), J=(Y+1),(Y+16)), I=1,NUMCLS)
  822     FORMAT (16(1X,F6.3))
          WRITE (6,824)
  824     FORMAT (/1X, 'TEMPORARY MATRIX 2')
          WRITE (6,822) ((TEMP2(I,J), J=(Y+1),(Y+16)), I=1,NUMCLS)
          WRITE (6,826) NEWVL1
  826     FORMAT (/1X, 'TEMP1 WEIGHTED R2:', 1X, F8.6)
          WRITE (6,828) NEWVL2
  828     FORMAT (/1X, 'TEMP2 WEIGHTED R2:', 1X, F8.6)
C * COMPARE WEIGHTED R2 VALUES WITH ORIGINAL WEIGHTED R2
C * MATRIX WITH THE HIGHEST WEIGHTED R2 BECOMES THE NEW "NEWRES"
          IF (NEWVL1 .GT. ORGVAL) THEN
            CALL COPYMX(TEMP1,NEWRES,Y,NUMCLS)
            ORGVAL = NEWVL1
          END IF
          IF (NEWVL2 .GT. ORGVAL) THEN
            CALL COPYMX(TEMP2,NEWRES,Y,NUMCLS)
            ORGVAL = NEWVL2
          END IF
          WRITE (6,850)
  850     FORMAT (/1X, 'THE NEW SET OF JOB CLUSTERS')
          WRITE (6,852) ((NEWRES(I,J), J=(Y+1),(Y+16)), I=1,NUMCLS)
  852     FORMAT (16(1X,F6.3))
          WRITE (6,854) ORGVAL
  854     FORMAT (/1X, 'WEIGHTED R2 EQUALS:', 1X, F8.6)
          GO TO 500
        END IF

        IF (NUMCOL .EQ. 3) THEN
          CALL HIGHR2(M1,PROW1,RN,Y+1)
          CALL HIGHR2(M2,PROW2,RN,Y+1)
          CALL HIGHR2(M3,PROW3,RN,Y+1)
          CALL COPYMX(NEWRES,TEMP1,Y,NUMCLS)
          CALL COPYMX(NEWRES,TEMP2,Y,NUMCLS)
          CALL COPYMX(NEWRES,TEMP3,Y,NUMCLS)
          CALL CREAT3(V,RINV,M1,M2,M3,PROW1,PROW2,PROW3,CJOB,Y,
     +                NUMCOL,NUMCLS,Y+16,CROW,TEMP1,TEMP2,TEMP3)
          CALL AVGWR2(TEMP1,NEWVL1,Y+1,Y+2,NUMCLS)
          CALL AVGWR2(TEMP2,NEWVL2,Y+1,Y+2,NUMCLS)
          CALL AVGWR2(TEMP3,NEWVL3,Y+1,Y+2,NUMCLS)
C * WRITE OUT TEMP MATRICES AND WEIGHTED R2 VALUES TO CHECK PROGRAM
          WRITE (6,880)
  880     FORMAT (/1X, 'TEMPORARY MATRIX 1')
          WRITE (6,882) ((TEMP1(I,J), J=(Y+1),(Y+16)), I=1,NUMCLS)
  882     FORMAT (16(1X,F6.3))
          WRITE (6,884)
  884     FORMAT (/1X, 'TEMPORARY MATRIX 2')
          WRITE (6,882) ((TEMP2(I,J), J=(Y+1),(Y+16)), I=1,NUMCLS)
          WRITE (6,886)
  886     FORMAT (/1X, 'TEMPORARY MATRIX 3')
          WRITE (6,882) ((TEMP3(I,J), J=(Y+1),(Y+16)), I=1,NUMCLS)
          WRITE (6,888) NEWVL1
  888     FORMAT (/1X, 'TEMP1 WEIGHTED R2:', 1X, F8.6)
          WRITE (6,890) NEWVL2
  890     FORMAT (/1X, 'TEMP2 WEIGHTED R2:', 1X, F8.6)
          WRITE (6,900) NEWVL3
  900     FORMAT (/1X, 'TEMP3 WEIGHTED R2:', 1X, F8.6)
C * COMPARE WEIGHTED R2 VALUES WITH ORIGINAL WEIGHTED R2.
C * MATRIX WITH THE HIGHEST WEIGHTED R2 BECOMES THE NEW "NEWRES".
          WRITE (6,910) ORGVAL
  910     FORMAT (/1X, 'ORGVAL ', F8.6)
          IF (NEWVL1 .GT. ORGVAL) THEN
            CALL COPYMX(TEMP1,NEWRES,Y,NUMCLS)
            NEWRES(CROW,(Y+3)) = 0.0
            ORGVAL = NEWVL1
          END IF
          IF (NEWVL2 .GT. ORGVAL) THEN
            CALL COPYMX(TEMP2,NEWRES,Y,NUMCLS)
            NEWRES(CROW,(Y+3)) = 0.0
            ORGVAL = NEWVL2
          END IF
          IF (NEWVL3 .GT. ORGVAL) THEN
            CALL COPYMX(TEMP3,NEWRES,Y,NUMCLS)
            NEWRES(CROW,(Y+3)) = 0.0
            ORGVAL = NEWVL3
          END IF
          WRITE (6,950)
  950     FORMAT (/1X, 'THE NEW SET OF JOB CLUSTERS')
          WRITE (6,952) ((NEWRES(I,J), J=(Y+1),(Y+16)), I=1,NUMCLS)
  952     FORMAT (16(1X,F6.3))
          WRITE (6,954) ORGVAL
  954     FORMAT (/1X, 'WEIGHTED R2 EQUALS:', 1X, F8.6)
          GO TO 500
        END IF
      END IF
      STOP
      END
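
As a cross-check on the V * R(INVERSE) * V(TRANSPOSED) step in the program above, the sketch below computes the same quadratic form for each row of a validity matrix. It is a minimal illustration in Python rather than the report's FORTRAN; the function name and the two-test example data are hypothetical, and the inverse correlation matrix is assumed to be supplied by the caller, as RINV is in the listing.

```python
def squared_multiple_correlations(v, r_inv):
    """Return row * r_inv * row^T for every row of v (one R2 per row)."""
    n_tests = len(r_inv)
    r2_values = []
    for row in v:
        # First product: the validity row times the inverse correlation matrix.
        vr = [sum(row[j] * r_inv[j][p] for j in range(n_tests))
              for p in range(n_tests)]
        # Second product: that result times the transposed validity row.
        r2_values.append(sum(vr[j] * row[j] for j in range(n_tests)))
    return r2_values


# Hypothetical data: two uncorrelated tests, so the inverse is the identity
# and each R2 reduces to the sum of squared validities.
validities = [[0.6, 0.0], [0.3, 0.4]]
identity = [[1.0, 0.0], [0.0, 1.0]]
r2 = squared_multiple_correlations(validities, identity)
```

With an identity matrix the result can be verified by hand, which makes this a convenient sanity check before substituting a real inverse correlation matrix.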
APPENDIX I


PSEUDO-RANDOM NUMBER GENERATOR


1. FORTRAN SOURCE CODE (Pavia Johnson, October, 1989)¹

      PROGRAM GENX2
$INCLUDE: 'RANDOM.INC'
      INTEGER I,J,P
C * DECLARE NORMAL 'RANDOM' FUNCTION
      REAL NORMAL
      EXTERNAL NORMAL
C * DECLARE FLOOR (LARGEST INTEGER LESS THAN) FUNCTION
      INTEGER FLOOR
      EXTERNAL FLOOR
      REAL X,Y
      REAL SUM(1:6)
      INTEGER SAMPLE(-100:100),OUT,IN
      DATA SAMPLE/201*0/
C * INITIALIZE RANDOM NUMBER GENERATOR INTERNAL DATA STRUCTURES
      CALL RANDINIT
C * CREATE OUTPUT FILE
      OPEN(7,FILE='XX.DAT')
C * GENERATE X MATRIX OF 17 BY NN
      CALL GENX
      STOP
      END

      REAL FUNCTION NORMAL()
C---------------------------------------------------------------------
C
C FUNCTION: NORMAL
C PURPOSE:  GENERATE A NORMALLY DISTRIBUTED RANDOM NUMBER BY SUMMING
C           UNIFORMLY DISTRIBUTED NUMBERS.
C CALLING SYNTAX:
C           REAL NORMAL
C           EXTERNAL NORMAL
C           X=NORMAL()
C WHERE:
C           X RECEIVES A "NORMALLY"-SELECTED VALUE
C
C---------------------------------------------------------------------
$INCLUDE: 'RANDOM.INC'
      INTEGER I
      REAL RANDOM
      EXTERNAL RANDOM
      REAL NORMALIZE(0:39)
      DATA NORMALIZE/-2.3182, -1.8123, -1.5509, -1.3676, -1.2222,
     +  -1.0988, -0.9904, -0.8924, -0.8020, -0.7177,
     +  -0.6883, -0.5627, -0.4901, -0.4202, -0.4523,
     +  -0.2860, -0.2196, -0.1568, -0.0934, -0.0304,
     +  +0.0304, +0.0934, +0.1568, +0.2196, +0.2860,
     +  +0.4523, +0.4202, +0.4901, +0.5627, +0.6883,
     +  +0.7177, +0.8020, +0.8924, +0.9904, +1.0988,
     +  +1.2222, +1.3676, +1.5509, +1.8123, +2.3182/

      NORMAL = 0.0
      IF(TFLAG.EQ.1)THEN
        DO 1 I=1,NUNIFORM
          NORMAL = NORMAL + RANDOM()
    1   CONTINUE
      ELSE


¹ Algorithm from Park and Miller (1988); multipliers from Fishman and Moore (1986).


        DO 2 I=1,NUNIFORM
          NORMAL=NORMAL+NORMALIZE(INT(RANDOM()/0.025))
    2   CONTINUE
      END IF
      NORMAL=(NORMAL-OFFSET)*SCALE
      RETURN
      END
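
When the table lookup is disabled, NORMAL sums NUNIFORM uniform deviates and then standardizes the sum. The Python sketch below illustrates that idea only; it assumes the offset NUNIFORM/2.0 and scale SQRT(12.0/NUNIFORM) that RANDINIT sets for this case, and the names `approx_normal` and `uniform` are hypothetical.

```python
import math
import random


def approx_normal(uniform, n_uniform):
    """Approximate a standard normal deviate by summing uniform deviates.

    uniform   -- callable returning a value in [0, 1) (assumed interface)
    n_uniform -- how many uniform deviates to sum
    """
    total = 0.0
    for _ in range(n_uniform):
        total += uniform()
    # A sum of n uniforms has mean n/2 and variance n/12, so subtracting
    # n/2 and multiplying by sqrt(12/n) gives mean 0 and variance 1.
    offset = n_uniform / 2.0
    scale = math.sqrt(12.0 / n_uniform)
    return (total - offset) * scale


# Usage with Python's own generator standing in for RANDOM.
rng = random.Random(1989)
sample = [approx_normal(rng.random, 12) for _ in range(5)]
```

With twelve uniforms the scale factor is exactly 1, which is why that count is a common choice for this approximation.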

      REAL FUNCTION RANDOM()
C---------------------------------------------------------------------
C
C FUNCTION: RANDOM
C PURPOSE:  GENERATE A UNIFORMLY DISTRIBUTED RANDOM NUMBER USING THE
C           ALGORITHM PROPOSED IN "RANDOM NUMBER GENERATORS: GOOD
C           ONES ARE HARD TO FIND" CACM, 10/88
C CALLING SYNTAX:
C           REAL RANDOM
C           EXTERNAL RANDOM
C           X=RANDOM()
C WHERE:
C           X RECEIVES A "RANDOMLY"-SELECTED VALUE, WHERE 0.0<=X<1.0
C
C---------------------------------------------------------------------
$INCLUDE: 'RANDOM.INC'
      REAL*8 MODULUS
      PARAMETER (MODULUS=2147483647.0D0)
      SEED(S,M)=DMOD(SEED(S,M)*MULT(M),MODULUS)
      RANDOM=SEED(S,M)/MODULUS
C * SELECT NEXT MULTIPLIER
      M=MOD(M+1,NMULTS)
C * SELECT NEXT SEED SEQUENCE
      IF(M.EQ.0) S=MOD(S+1,NSEEDS)
      RETURN
      END
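
The multiplicative congruential step in RANDOM can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the report's implementation: the class name `MultiLehmer` is hypothetical, the multiplier index is assumed to advance by one on every call and wrap around, and the seed row is assumed to advance each time the multiplier index wraps to zero.

```python
MODULUS = 2**31 - 1  # 2147483647, the prime modulus used by RANDOM


class MultiLehmer:
    """Lehmer generator that cycles through several multipliers and seeds."""

    def __init__(self, seeds, multipliers):
        # seeds[s][m] holds the current state for seed row s, multiplier m.
        self.seeds = [list(row) for row in seeds]
        self.mults = list(multipliers)
        self.s = 0
        self.m = 0

    def random(self):
        s, m = self.s, self.m
        # Core step: state <- (state * multiplier) mod (2**31 - 1).
        self.seeds[s][m] = (self.seeds[s][m] * self.mults[m]) % MODULUS
        value = self.seeds[s][m] / MODULUS
        # Select the next multiplier; move to the next seed row on wrap.
        self.m = (m + 1) % len(self.mults)
        if self.m == 0:
            self.s = (s + 1) % len(self.seeds)
        return value


# Single-stream example using Park and Miller's multiplier 16807,
# chosen here only because its behavior is easy to check.
gen = MultiLehmer([[1]], [16807])
first = gen.random()
```

Python integers are arbitrary precision, so the product in the core step is exact. Park and Miller give 1043618065 as the state reached after 10,000 steps from a seed of 1 with multiplier 16807, which is a convenient check for any port of this step.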

      SUBROUTINE RANDINIT
C---------------------------------------------------------------------
C
C SUBROUTINE: RANDINIT
C PURPOSE:  INITIALIZE DATA STRUCTURES REQUIRED BY THE RANDOM NUMBER
C           GENERATION ROUTINES.
C SYNTAX:   CALL RANDINIT
C
C---------------------------------------------------------------------
      REAL*8 MODULUS
      PARAMETER (MODULUS=2147483647.0D0)
      INTEGER I,J
      REAL*8 LSEED
$INCLUDE: 'RANDOM.INC'
C * PROGRAMMING NOTE: THIS IS AN EXAMPLE OF HOW TO DO A 'WHILE' LOOP,
C * IN THIS CASE, 'WHILE NMULTS=0 DO...'
      NMULTS=0
    1 IF(NMULTS.LT.1 .OR. NMULTS.GT.MAXMULTS)THEN
        WRITE(6,'(" USING A DIFFERENT MULTIPLIER RESULTS IN")')
        WRITE(6,'(" A DIFFERENT RANDOM NUMBER SEQUENCE.")')
        WRITE(6,
     +  '(" ENTER NUMBER OF DIFFERENT MULTIPLIERS TO USE: "$)')
        READ(5,'(I3)')NMULTS
        GOTO 1
      END IF

      NUNIFORM=0
    2 IF(NUNIFORM.LT.1 .OR. NUNIFORM.GT.200)THEN
        WRITE(6,'(" NORMAL DEVIATES ARE PRODUCED BY SUMMING A")')
        WRITE(6,
     +  '(" NUMBER OF UNIFORM DEVIATES. THIS NUMBER MAY BE")')
        WRITE(6,'(" THE SAME AS THE NUMBER OF MULTIPLIERS")')
        WRITE(6,'(" ENTER NUMBER OF UNIFORM DEVIATES TO USE: "$)')
        READ(5,'(I3)')NUNIFORM
        GOTO 2
      END IF
      NSEEDS=0
    3 IF(NSEEDS.LT.1 .OR. NSEEDS .GT. MAXSEEDS)THEN
        WRITE(6,'(" USING A DIFFERENT SEED RESULTS IN THE RANDOM")')
        WRITE(6,'(" SEQUENCE STARTING AT A DIFFERENT POINT.")')
        WRITE(6,'(" ENTER THE NUMBER OF SEEDS PER MULTIPLIER: "$)')
        READ(5,'(I3)')NSEEDS
        GOTO 3
      END IF
      TFLAG=0
    4 IF(TFLAG.LT.1 .OR. TFLAG.GT.2)THEN
        WRITE(6,
     +  '(" THE UNIFORMLY DEVIATES MAY BE TRANSFORMED INTO")')
        WRITE(6,
     +  '(" NORMAL DEVIATES BY A TABLE LOOKUP PROCESS BEFORE")')
        WRITE(6,'(" SUMMING.")')
        WRITE(6,
     +  '(" ENTER 1 TO DISABLE TRANSFORMATION, 2 TO ENABLE: "$)')
        READ(5,*)TFLAG
        GOTO 4
      END IF

      IF(TFLAG.EQ.2)THEN
        OFFSET=0
        SCALE=SQRT(NUNIFORM)/NUNIFORM
      ELSE
        OFFSET=NUNIFORM/2.0
        SCALE=SQRT(12.0/NUNIFORM)
      END IF
C * INITIALIZE SEED ARRAY USING A 'PRIVATE' GENERATOR
      LSEED=0
    5 IF(LSEED.EQ.0)THEN
        WRITE(6,'(" ENTER THE INITIALIZATION SEED: "$)')
        READ(5,*)LSEED
        GOTO 5
      END IF
      DO 7 I=0,NSEEDS-1
        DO 6 J=0,NMULTS-1
          LSEED=DMOD(LSEED*1704318220D0,MODULUS)
          SEED(I,J)=LSEED
    6   CONTINUE
        WRITE(6,'(" ",10F12.0)')(SEED(I,J),J=0,NMULTS-1)
    7 CONTINUE
      WRITE(6,'(" NEXT INITIALIZATION SEED: ",F12.0)')LSEED
      RETURN
      END

      INTEGER FUNCTION FLOOR(X)
      REAL X
      IF(X.LT.0.0)THEN
        FLOOR=INT(X)-1
      ELSE
        FLOOR=INT(X)
      END IF
      RETURN
      END

C * SUBROUTINE TO GENERATE X MATRIX OF RND'S
      SUBROUTINE GENX
      INTEGER I,J,NN
      REAL XSAMPLE(30,600)
      REAL NORMAL
      EXTERNAL NORMAL
$INCLUDE: 'RANDOM.INC'
      WRITE(6,'(" ENTER SAMPLE SIZE: "$)')
      READ(5,*)NN
      DO 2 I=1,NMULTS
        DO 1 J=1,NN
          XSAMPLE(I,J)=NORMAL()
    1   CONTINUE
    2 CONTINUE
C     WRITE (7,*)NN
      WRITE (7,*) ((XSAMPLE(I,J),J=1,NN),I=1,20)
      RETURN
      END


      BLOCK DATA
$INCLUDE: 'RANDOM.INC'
      DATA MULT/1493834601D0,1037566960D0, 743722486D0,1509089937D0,
     + 1567699476D0,1947306907D0,1076532097D0,1957811727D0,
     +  628467148D0,1040895393D0, 786824435D0, 556530824D0,
     +   87921290D0,1457913431D0, 385787459D0,1567316532D0,
     +  930959341D0,1588813465D0,1035519219D0,  36944245D0,
     + 1891356973D0,1897412292D0, 754680739D0,1971204812D0,
     + 1888847798D0,1571641634D0,1117435554D0, 569170662D0,
     +  927407259D0,1490690267D0, 235716977D0, 149289625D0,
     + 1660576129D0,1517266187D0,1229881012D0, 707656279D0,
     + 1869095734D0, 995560464D0, 539146268D0,1604187179D0,
     + 2062150220D0, 370594724D0,2044924591D0, 916100787D0,
     + 1037414126D0,1838122410D0,1265438464D0,1007804709D0,
     + 1257431879D0,2061749697D0, 737009774D0, 408432740D0,
     +  876389446D0,1294711786D0, 965146404D0, 737154017D0,
     +  764970606D0,1074109599D0,1039219247D0, 428641644D0,
     + 1522856686D0,1019054714D0, 805874727D0,1165699491D0,
     +  258880375D0,1554283637D0,1155862579D0, 848396760D0,
     +  915892507D0, 614779685D0, 391842496D0, 380006810D0,
     + 2011769251D0,1860139263D0,1920597088D0,1993412958D0,
     +  511806823D0, 979167897D0,1956806422D0,1256909708D0,
     +  581488682D0, 334258581D0,  68580478D0, 534897944D0,
     +  251676340D0,1051072528D0,2101655234D0,1413698051D0,
     +  796322341D0, 698108846D0,1544249456D0, 857010188D0,
     + 1860488201D0, 355389105D0,1774722449D0,1582405117D0,
     +  553469741D0,1411007767D0,1230102545D0, 356267478D0,
     +  778084663D0,1905014417D0,1109871330D0,1704318220D0,
     +  270593738D0, 483389111D0, 323128013D0, 361076890D0/
      DATA SEED/S8YM*1.0/
      DATA S/0/
      DATA M/0/
      END
TABLE I-1: INITIALIZATION SEEDS USED TO GENERATE RANDOM
NORMAL DEVIATES FOR 20 CROSS-SAMPLES IN DESIGN A
(XX1-XX20)


Cross-Sample    Initialization Seed

XX1             2102089753
XX2             1396324989
XX3             594201671
XX4             1049251362
XX5             195748861
XX6             2143136572
XX7             160875454
XX8             851439770
XX9             126617071
XX10            1318897636
XX11            514694161
XX12            1410932621
XX13            603731346
XX14            1410147358
XX15            706848193
XX16            340061464
XX17            1218029222
XX18            1037466748
XX19            983209347
XX20            2067888841